From: Greg Heath on 28 Apr 2010 07:44

On Apr 27, 9:26 pm, "John G" <a...(a)yahoo.com> wrote:
> Peter Perkins <Peter.Perk...(a)MathRemoveThisWorks.com> wrote in message <hr81mf$nn...(a)fred.mathworks.com>...
> > On 4/27/2010 8:04 PM, John G wrote:
> > > The LDA built into the stats toolbox appears to assume covariances equal
> > > & classes distributed normally, unlike Fisher LDA,
> >
> > That _is_ Fisher LDA. How do you define it? If you want _unequal_ cov
> > matrices, that's quadratic discriminant analysis.
>
> I guess I was wrong then. I thought Fisher's LDA was a bit different
> (Wikipedia says it doesn't necessarily make the same assumptions as
> regular LDA).
>
> How do you implement the MatLab LDA then?
>
> [C,err,P,logp,coeff] = classify(sample,training,group,'linear')
>
> but what would you use for group and training? The example is kind of
> unclear. I'm uncertain what a training data set is - is it a particular
> subset of the m x m array you're working with or can it be generalized
> to something else or what?

In general, the total data set is partitioned into three subsets:

training: design data used to directly determine the weights, given
training parameters (e.g., the % of data used for each of the subsets
and/or the prior probability weighting and misclassification costs for
each class).

validation: nontraining design data repetitively used to estimate
predictive performance so that training parameters can be optimized.

test: nondesign data used once and only once to obtain an unbiased
estimate of predictive performance on unseen data.

If you wish to retrain because the test set performance is
significantly worse than the validation set performance, you should
repartition the data to try to keep the new test results as unbiased
as possible. Typically, I find that when this happens, 10-fold
crossvalidation is a better alternative.

Hope this helps.

Greg
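As an illustration of that three-way split, here is a minimal MATLAB
sketch; the variable names X (an N-by-d feature matrix) and y (an
N-by-1 numeric label vector), and the 60/20/20 proportions, are
assumptions for the example, not anything from the thread:

    % A minimal sketch of a 60/20/20 train/validation/test split,
    % assuming X is an N-by-d feature matrix and y an N-by-1 vector
    % of numeric class labels (both names are hypothetical).
    N = size(X,1);
    idx = randperm(N);                          % shuffle the observations
    nTrain = round(0.6*N);
    nVal   = round(0.2*N);
    trainIdx = idx(1:nTrain);
    valIdx   = idx(nTrain+1:nTrain+nVal);
    testIdx  = idx(nTrain+nVal+1:end);

    % Tune training parameters against the validation set...
    Cval   = classify(X(valIdx,:), X(trainIdx,:), y(trainIdx), 'linear');
    valErr = mean(Cval ~= y(valIdx));           % validation misclassification rate

    % ...then touch the test set once, at the very end.
    Ctest   = classify(X(testIdx,:), X(trainIdx,:), y(trainIdx), 'linear');
    testErr = mean(Ctest ~= y(testIdx));        % estimate on unseen data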
From: Peter Perkins on 28 Apr 2010 09:06

On 4/27/2010 9:26 PM, John G wrote:
> I guess I was wrong then. I thought Fisher's LDA was a bit different
> (Wikipedia says it doesn't necessarily make the same assumptions as
> regular LDA).

John, I guess I fall into the "The terms Fisher's linear discriminant
and LDA are often used interchangeably" crowd that Wikipedia describes.
But I think it's kind of like the "least squares vs. maximum likelihood
for the normal distribution" distinction: you end up in the same place,
but from two different justifications.

By the way, the references that I looked at disagree with the Wikipedia
article in that they use a pooled cov estimate for "Fisher's LDA". I
was unaware that "Fisher's original article actually describes a
slightly different discriminant, which does not make some of the
assumptions of LDA such as ... equal class covariances". I take it on
faith that the author of that statement read the paper, because I
haven't. Unequal covariances lead to QDA if you look at likelihood
ratios.

You might consider using a classification tree if you're concerned
about the assumptions of LDA.

> How do you implement the MatLab LDA then?
>
> [C,err,P,logp,coeff] = classify(sample,training,group,'linear')
>
> but what would you use for group and training? The example is kind of
> unclear. I'm uncertain what a training data set is - is it a particular
> subset of the m x m array you're working with or can it be generalized
> to something else or what?

The training data are a set of observations for which you know both the
predictor variables _and the class in which each observation falls_.
Let's say you were trying to classify whether or not a new patient has
a disease. You have records of 100 patients who are known to have it or
not, as well as various other pieces of information about them. All of
that is your training data. When you get a new patient, all you know is
the "other information", and you try to predict in advance whether they
will fall into the "disease" class or the "no disease" class.
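To make that concrete, here is a toy MATLAB version of the disease
example with synthetic numbers; the two-feature setup and all variable
names are made up for illustration:

    % Toy version of the example above: 100 patients whose disease
    % status is known form the training data (synthetic here).
    healthy  = randn(50,2);                   % measurements, no disease
    sick     = randn(50,2) + 2;               % measurements, disease (shifted mean)
    training = [healthy; sick];               % 100-by-2 "other information"
    group    = [zeros(50,1); ones(50,1)];     % known class: 0 = none, 1 = disease

    % For a new patient, only the measurements are known; classify
    % predicts which class they fall into.
    newPatient = [1.5 1.5];
    predicted  = classify(newPatient, training, group, 'linear')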
From: Ting Su on 28 Apr 2010 11:14

John,

By Fisher LDA, you probably mean a feature extraction (dimension
reduction) method that tries to find a projection maximizing the class
separability. The linear discriminant analysis function in MATLAB's
Statistics Toolbox is intended for classification (no projection is
involved). Both methods are referred to as LDA in the literature and
share a common assumption, but they are different methods. It looks
like "Fisher's LDA" is usually used to refer to the feature extraction
method.

-Ting

"John G" <asdf(a)yahoo.com> wrote in message news:hr7gbg$8gv$1(a)fred.mathworks.com...
> I've got an m x m array. I want to apply Fisher discriminant analysis to
> it - the LDA in MatLab's stats toolbox isn't the Fisher one so I used the
> version provided by the supplementary toolbox stprtool package.
> http://cmp.felk.cvut.cz/cmp/software/stprtool/index.html
>
> How do I run my program? I don't really understand the input:
>
> data [struct] Binary labeled training vectors.
>  .X [dim x num_data] Training vectors.
>  .y [1 x num_data] Labels (1 or 2)
>
> Also, I'm not really sure I understand the concept of a training data set.
> Does it have to be a subset of the array you want to analyze or a general
> data set?
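For what it's worth, a bare-bones MATLAB sketch of that
feature-extraction version for two classes; the data matrices X1 and
X2 are hypothetical synthetic data:

    % Fisher's discriminant as dimension reduction: find the direction
    % w that maximizes between-class separation relative to
    % within-class scatter, then project onto it.
    X1 = randn(40,3);                 % class 1, rows are observations
    X2 = randn(40,3) + 1;             % class 2, shifted mean
    m1 = mean(X1,1)';  m2 = mean(X2,1)';
    Sw = (size(X1,1)-1)*cov(X1) + (size(X2,1)-1)*cov(X2);  % within-class scatter
    w  = Sw \ (m1 - m2);              % Fisher direction
    z1 = X1*w;  z2 = X2*w;            % one-dimensional projections of each class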
From: Ting Su on 28 Apr 2010 16:09

John,

It turns out the function "classify" in MATLAB's Statistics Toolbox
returns, as its fifth output, the coefficients describing the boundary
between the regions separating each pair of groups, so this function
can be used to find the projection maximizing the class separability.
The projection direction is the normal of that decision boundary.

-Ting

"Ting Su" <Ting.Su(a)mathworks.com> wrote in message news:hr9jd1$j8e$1(a)fred.mathworks.com...
> John,
>
> By Fisher LDA, you probably mean a feature extraction (dimension
> reduction) method that tries to find a projection maximizing the class
> separability. The linear discriminant analysis function in MATLAB's
> Statistics Toolbox is intended for classification (no projection is
> involved). Both methods are referred to as LDA in the literature and
> share a common assumption, but they are different methods. It looks
> like "Fisher's LDA" is usually used to refer to the feature extraction
> method.
>
> -Ting
>
> "John G" <asdf(a)yahoo.com> wrote in message
> news:hr7gbg$8gv$1(a)fred.mathworks.com...
>> I've got an m x m array. I want to apply Fisher discriminant analysis to
>> it - the LDA in MatLab's stats toolbox isn't the Fisher one so I used the
>> version provided by the supplementary toolbox stprtool package.
>> http://cmp.felk.cvut.cz/cmp/software/stprtool/index.html
>>
>> How do I run my program? I don't really understand the input:
>>
>> data [struct] Binary labeled training vectors.
>>  .X [dim x num_data] Training vectors.
>>  .y [1 x num_data] Labels (1 or 2)
>>
>> Also, I'm not really sure I understand the concept of a training data
>> set. Does it have to be a subset of the array you want to analyze or a
>> general data set?
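Concretely, something along these lines; the synthetic data and the
names X and y are assumptions for the sketch:

    % Recover the projection direction from classify's fifth output.
    X = [randn(40,2); randn(40,2)+2];         % synthetic two-class data
    y = [ones(40,1); 2*ones(40,1)];
    [C,err,P,logp,coeff] = classify(X, X, y, 'linear');
    L = coeff(1,2).linear;                    % boundary: coeff(1,2).const + x*L = 0
    z = X*L;                                  % project onto the boundary's normal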