From: Mohammad A. Mezher on 24 Apr 2010 12:53
Hi, I am not sure how to use the function perfcurve in MATLAB. It takes several input parameters:

[X,Y] = perfcurve(labels,scores,posclass);

What are the labels? What are the scores? I read all of its documentation but can't find which data I have to pass as labels and as scores. I would appreciate any clarification.

Thanks
From: Sadik on 24 Apr 2010 18:54
Hi Mohammad,

The following example from the documentation is very illustrative. I will annotate it so that it is easier to follow:

1. load fisheriris
% MATLAB's own dataset (Fisher's iris data). There are three species of iris flower: setosa, versicolor and virginica [these names are in the variable species], with 50 samples per species. The first fifty rows are setosa, the second fifty are versicolor and the last fifty are virginica.
2. x = meas(51:end,1:2);
% If you load the data, you will see that meas is a 150x4 matrix: 150 samples with 4 features per sample. x = meas(51:end,1:2) selects the rows belonging to versicolor and virginica and keeps only 2 of the 4 features.
3. y = (1:100)'>50;
% versicolor=0, virginica=1
% 50 zeros followed by 50 ones. This means versicolor is represented by zeros and virginica by ones in the glm.
4. b = glmfit(x,y,'binomial');
% Obtain the model parameters.
5. p = glmval(b,x,'logit');
% Using these parameters, compute the output of the classifier (the fitted probability of virginica for each sample). This is what goes into "scores" in perfcurve.
6. [X,Y,T,AUC] = perfcurve(species(51:end,:),p,'virginica');
% Now it is easy to see: "labels" is nothing but the list of true labels in your data set. Since the reduced dataset has 50 versicolors and 50 virginicas, "labels" is a cell array whose first 50 elements are the string 'versicolor' and whose last 50 are 'virginica'.
% The last input to perfcurve, "posclass", is the label of the positive class. In step 3 above we assigned 1 to the second fifty samples, which are virginica. Therefore, the label of the positive class is 'virginica'.
plot(X,Y)
xlabel('False positive rate'); ylabel('True positive rate')
title('ROC for classification by logistic regression')

Best.
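To make the roles of "labels", "scores" and "posclass" concrete, here is a minimal sketch that builds ROC points by hand using only base MATLAB. The six labels and scores are made up for illustration, and the code ignores tied scores and the other options that perfcurve handles, so treat it as a picture of the idea rather than a description of how perfcurve is implemented.

```matlab
% Hypothetical toy data: true class labels and classifier scores.
labels   = {'versicolor'; 'virginica'; 'versicolor'; ...
            'virginica';  'virginica'; 'versicolor'};
scores   = [0.10; 0.90; 0.40; 0.35; 0.80; 0.20];  % higher = more likely virginica
posclass = 'virginica';

% Mark which samples belong to the positive class.
isPos = strcmp(labels, posclass);

% Sweep the decision threshold from the highest score to the lowest.
[~, order] = sort(scores, 'descend');
pos = double(isPos(order));   % 1 where the sorted sample is positive
neg = 1 - pos;                % 1 where the sorted sample is negative

% After each sample is passed, record the fraction of positives and
% negatives that would be called positive at that threshold.
tpr = [0; cumsum(pos) / sum(pos)];   % true positive rate  (Y axis)
fpr = [0; cumsum(neg) / sum(neg)];   % false positive rate (X axis)

% Area under the ROC curve by the trapezoidal rule.
auc = trapz(fpr, tpr);
fprintf('AUC = %.4f\n', auc);
```

In this toy set, 8 of the 9 positive/negative pairs are ranked correctly (only the virginica sample with score 0.35 falls below a versicolor sample), so the area comes out as 8/9.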
From: Mohammad A. Mezher on 24 Apr 2010 23:09
Thank you, Sadik, for your reply. I think I am missing something here: you did not use separate training and testing data. For example:

load ionodata % ionosphere dataset: A holds the data and groups holds their labels
indices = crossvalind('Kfold',groups,3);
i = 1; % fold used for testing (repeat for i = 2 and 3)
test = (indices == i);
train = ~test;
svmStruct = svmtrain(A(train,:),groups(train)); % index rows, keep all columns
classes = svmclassify(svmStruct,A(test,:));

I am stuck here. What data do I have to pass to the perfcurve function?

pos = 0; % label of the positive class
[X,Y,T,AUC] = perfcurve(labels,scores,pos);

Should the labels and scores be the labels and confidences of the training data, of the test data, or of the whole dataset?

Thank you in advance.
From: Sadik on 25 Apr 2010 07:06
You should use the labels of the testing data and the scores that the classifier produces for the testing data.

Best.
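As a sketch of that advice, the snippet below fits a model on the training rows only and then collects the labels and scores for the test rows only. To stay within base MATLAB it uses a least-squares linear scorer as a stand-in for the SVM, a fixed hold-out mask as a stand-in for one crossvalind fold, and a small made-up dataset; all of these are assumptions for illustration, not part of the original example.

```matlab
% Hypothetical data: 8 samples, 2 features, labels in {0,1}.
A      = [1 2; 2 1; 2 3; 3 2; 6 5; 7 7; 6 7; 8 6];
groups = [0; 0; 0; 0; 1; 1; 1; 1];

% Fixed hold-out split (stand-in for one cross-validation fold).
test  = logical([0; 1; 0; 0; 1; 0; 0; 1]);
train = ~test;

% Stand-in classifier: least-squares linear fit on the training rows only.
Xtrain = [ones(sum(train), 1), A(train, :)];
w      = Xtrain \ groups(train);

% Scores and true labels for the test rows only.
scores = [ones(sum(test), 1), A(test, :)] * w;
labels = groups(test);

% With the Statistics Toolbox available, these two vectors are what
% would be passed on, with 1 as the positive class:
% [X,Y,T,AUC] = perfcurve(labels, scores, 1);
```

The point of the design is that nothing from the test rows is seen during fitting, so the resulting ROC curve reflects performance on unseen data.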
From: Mohammad A. Mezher on 25 Apr 2010 13:04
Thank you for your reply. Do you know of any function that computes the score value (the confidence) for the testing data? I have the following equation:

d(x) = sum K(xi,x)*(alphai.*yi) - 0.5*sum K(xi,sv)*(alphai.*yi)

where
xi training data
yi labels of the training data
x testing data
K() kernel function
sv support vector
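One way to evaluate a decision value of this form in base MATLAB is sketched below. The support vectors, multipliers, labels and kernel are hypothetical values chosen for illustration, and the second sum of the equation, which does not depend on x, is treated here as a precomputed scalar offset b; that simplification is an assumption, not something stated in the thread.

```matlab
% Hypothetical trained-SVM quantities (not read from any real svmStruct).
sv    = [1 1; 3 3];      % support vectors xi, one per row
alpha = [0.25; 0.25];    % multipliers alphai
y     = [-1; 1];         % labels yi of the support vectors
b     = -2;              % constant offset, assumed already computed

% Linear kernel; replace with any other kernel K(U,V) that returns a
% matrix with one row per row of U and one column per row of V.
K = @(U, V) U * V';

% Test points x, one per row.
Xtest = [0 0; 2 2; 4 4];

% d(x) = sum_i alphai * yi * K(xi, x) + b, evaluated for every test row.
d = K(Xtest, sv) * (alpha .* y) + b;

disp(d');   % signed decision values, usable as scores for the test data
```

The sign of d gives the predicted side of the boundary, and its magnitude can serve as the confidence that goes into "scores", provided the positive class passed to perfcurve is the one with yi = +1.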