From: Mohiyuddin on
Hi,
I want to use the classification and regression tree (CART) algorithm to classify some data that I have collected. I came across the command
t=classregtree(X,y)
in MATLAB, where I understand that X is the matrix containing the data to be classified. The MATLAB help says: "If y is a categorical variable, character array, or cell array of strings, classregtree performs classification",
but I don't understand how to create this cell array y, i.e. on what basis are the contents of y chosen?
Could somebody please help me out? It's urgent because this is my final-year engineering project and I have just two weeks left.
From: Wayne King on
"Mohiyuddin " <mbyadgi(a)gmail.com> wrote in message <hr9549$r65$1(a)fred.mathworks.com>...

Hi, y is your response variable. The documentation is just telling you that the response is often not quantitative, and if it's not, then you have a classification problem.

Look at the documentation and you will see an example that uses the fisheriris data set.
Load the data:
>> load fisheriris

Now look at the cell array species. It holds one class label for each row of meas, and it is an example of a response variable that is not quantitative.
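
A minimal sketch of what that inspection might look like, assuming the Statistics Toolbox is installed. The variables myX and myY and the class names 'classA' and 'classB' are hypothetical and only illustrate how a response of the same kind could be built for your own data:

```matlab
load fisheriris            % defines the variables meas and species

size(meas)                 % numeric matrix: one row per flower, one column per measurement
size(species)              % cell array of strings: one label per row of meas
unique(species)            % the distinct class names

% Hypothetical example of the same layout for your own data:
% each row of myX is one observation, and myY{i} is the known class of row i.
myX = [5.1 3.5; 4.9 3.0; 6.7 3.1; 6.3 2.5];
myY = {'classA'; 'classA'; 'classB'; 'classB'};
```

The point is that y is not computed from anything: it is the list of classes you already know for your training observations, in the same row order as X.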

Wayne
From: Tom Lane on

I agree with everything Wayne wrote, but if you say you want to "classify"
some data and you don't know how to create the array of classes, could it be
that you really want to "cluster" the data? If so, take a look at the
kmeans, gmdistribution, and linkage functions. Or type "help stats" and look
at the things under the "Cluster Analysis" heading.
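
For comparison, a rough sketch of the clustering route using kmeans on the same fisheriris data. The choice of k = 3 is an assumption made for this example, and the cross-tabulation against species is only for illustration, since real clustering problems have no known labels:

```matlab
load fisheriris

% Group the flowers using the measurements only; species is not used here.
k = 3;                     % number of clusters, assumed for this sketch
idx = kmeans(meas, k);     % idx(i) is the cluster number (1..k) assigned to row i

% Compare the discovered clusters with the known species.
crosstab(idx, species)
```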

-- Tom


From: Mohiyuddin on
"Wayne King" <wmkingty(a)gmail.com> wrote in message <hr96mt$9qv$1(a)fred.mathworks.com>...


Thanks for your suggestion, Wayne. I have looked at the cell array species and it contains only the names of the flower species. My difficulty now is this: how does the classregtree command classify those flowers into 3 different species using just the matrix 'meas' containing the measurement data and a cell array containing only the species names? Doesn't this command require some extra deciding parameters to classify those flowers?
Also, if I collect the same measurements (i.e. sepal length, sepal width, petal length and petal width) for a new flower whose class is not known, how can I assign a class to this flower using the CART algorithm?
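
On the first question, a sketch that may help: no extra deciding parameters have to be supplied, because fitting the tree is the step that finds them. CART searches the columns of meas for threshold splits that best separate the labels in species. The short predictor names passed through 'names' are labels chosen for this example:

```matlab
load fisheriris

% Fit a classification tree from the measurements and the known labels.
t = classregtree(meas, species, 'names', {'SL' 'SW' 'PL' 'PW'});

t            % no semicolon, so the learned decision rules are printed as text
view(t)      % opens a graphical view of the same tree
```

Each printed rule compares one measurement with a threshold that was learned from the training data.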
From: Mohiyuddin on
"Tom Lane" <tlane(a)mathworks.com> wrote in message <hr9f44$qk2$1(a)fred.mathworks.com>...


Thank you very much, Tom, but I don't want to cluster the data; I want to classify it. I have training data, i.e. the classes to which the data belong are known. My task is to identify the class of newly collected data using the training database and a CART classifier.
I have looked into the 'fisheriris' example mentioned in the MATLAB help. There, the cell array 'species' contains only the names of the flower species. My difficulty now is this: how does the classregtree command classify those flowers into 3 different species using just the matrix 'meas' containing the measurement data and a cell array containing only the species names? Doesn't this command require some extra deciding parameters to classify those flowers?
Also, if I collect the same measurements (i.e. sepal length, sepal width, petal length and petal width) for a new flower whose class is not known, how can I identify the class of this flower using the CART algorithm?
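
A minimal sketch of that last step, assuming the tree is trained on fisheriris as in the help example. The values in newFlower are hypothetical measurements, and they must be given in the same column order and units as meas:

```matlab
load fisheriris

% Train on the labelled flowers: meas holds the measurements, species the known classes.
t = classregtree(meas, species);

% Hypothetical new flower: [sepal length, sepal width, petal length, petal width]
newFlower = [5.9 3.0 5.1 1.8];

% Pass the new measurements down the tree to obtain a class label.
predicted = eval(t, newFlower)     % cell array holding the predicted species name
```

Several new flowers can be classified at once by passing one row per flower. In newer MATLAB releases the documentation points to fitctree and predict in place of classregtree and eval; the idea is the same.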