PCA [Matlab]

From: Greg Heath on 5 Sep 2006 22:59

Gian Piero Bandieramonte wrote:
> >> But is it only this scenario that makes PCA inappropriate for
> >> classification problems? I have 37-dimensional data, not 3-D data,
> >> and it is really tough to see whether this scenario is happening to
> >> me. I don't know if there are MATLAB tools to assist me with this,
> >> or some theory...
>
> >You still haven't indicated the type of classifier you are using.
>
> The classifier I'm using is an RBF (radial basis function) network,
> specifically the function newrb.
>
> >> Greg said that if my test set is sufficiently large, then I could
> >> apply PCA with no correctness problems. My test data is large
> >> enough, so there is no problem with it.
>
> > How large is it?
>
> Well, the mean size of the test sets is approx. 3000 rows. I
> hope this is large enough...
>
> >> But now another problem
> >> arises: if I now want to simulate my network
>
> > Is this a neural network model?
>
> Yes it is a neural network model.
>
> > That is exactly what I've been talking about!
> >
> > This is an insufficiently large test set for your method to deal
> > with.
> >
> > Reread my original post. Also read my post on pretraining advice.
> >
> > Hope this helps.
> >
> > Greg
>
> I first test my network with many test sets for which the correct
> outputs are known; the mean size of those sets is 3000 (as I
> mentioned earlier in this reply). Then I simulate the network with
> another test set, whose size is 1. As you said, it is too small and
> I need to apply some transform, and in other replies you
> explained:
>
> "You have to use the transformation matrix from the 1st batch
> to transform the second batch (instead of performing PCA on the
> second batch).
>
> I use eigs(corcoeff(X)) for PCA (instead of princomp), so
> I don't know if the transformation matrix is available to you
> without solving T*X = PC using T = PC/X."
>
> I don't quite understand where I obtain the transformation matrix.
> Does eigs(corcoeff(X)) return the transformation matrix?

Sorry.

I wrote that formula to explain why I can't help with the details of
using PRINCOMP... I do not use it. I was not suggesting that
you replace it with what I do.

The other point I was trying to make is: if you have the input matrix
X and the resulting matrix PC from PRINCOMP, you can use
slash to determine a transformation matrix T. However, I don't
see why PRINCOMP or a related function doesn't provide T
directly.
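
A minimal sketch of the slash idea, under stated assumptions: the data
are laid out with variables in rows (X is d-by-N), the principal
components satisfy PC = T*X, and both X and the orthogonal matrix T0
below are made up purely for illustration. Solving T*X = PC for T then
corresponds to right division, PC/X.

```matlab
% Hypothetical data: d = 3 variables in rows, N = 50 samples in columns.
d = 3;
N = 50;
X = randn(d, N);

% T0 is an arbitrary orthogonal matrix standing in for the unknown transform.
[T0, R] = qr(randn(d));      % R is not used
PC = T0 * X;                 % what a PCA routine would hand back

% Right division solves T*X = PC for T (least squares, exact here).
T = PC / X;

% Largest absolute difference between recovered and true transform.
err = max(max(abs(T - T0)));
```

For what it is worth, the Statistics Toolbox documents the first output
of PRINCOMP, as in [COEFF,SCORE] = princomp(X) with observations in
rows, as the coefficient matrix, with SCORE equal to the centered data
times COEFF, so the slash step may not be needed there.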

> Is the corcoeff function in fact MATLAB's corrcoef function for
> computing correlation coefficients? Is X the 1st batch? Do I multiply
> the second batch by the transformation matrix to transform it?

Yes. However, the transformation matrix for standardized
variables is obtained from the eigenvectors of the correlation
coefficient matrix.
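
The steps above can be sketched as follows. This is an illustration
only, not necessarily the exact code used by anyone in this thread: the
batches X1 and x2 are made-up data with observations in rows (the
layout corrcoef expects), and eig is used in place of eigs so that all
eigenvectors are returned.

```matlab
% Hypothetical batches: observations in rows, variables in columns.
M  = [2 0 0 0; 1 1 0 0; 0 0.5 1 0; 0 0 0 0.1];
X1 = randn(200, 4) * M;      % first (large) batch
x2 = randn(1, 4);            % second batch: a single sample

% Standardize the first batch and remember its statistics.
N1 = size(X1, 1);
mu = mean(X1, 1);
sg = std(X1, 0, 1);
Z1 = (X1 - repmat(mu, N1, 1)) ./ repmat(sg, N1, 1);

% Eigenvectors of the correlation coefficient matrix,
% sorted by decreasing eigenvalue.
C = corrcoef(X1);
C = (C + C') / 2;            % guard against round-off asymmetry
[V, D] = eig(C);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

% Transformation matrix for variables-in-rows data: PC = T * Z'.
T   = V';
PC1 = T * Z1';

% Transform the second batch with the FIRST batch's mean, std and T
% (no PCA is performed on the second batch itself).
z2  = (x2 - mu) ./ sg;
pc2 = T * z2';
```

To reduce dimensionality, keep only the first k rows of T, where k is
chosen from the sorted eigenvalues in lambda.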

> Thanks for your help...

You're welcome.

Hope this helps.

Greg

From: Greg Heath on 6 Sep 2006 08:37

Read my post on pretraining advice. There I recommend
PREPCA which outputs T and PC.

help prepca
help trapca
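
A hedged sketch of how these functions fit together, assuming the
Neural Network Toolbox of this era is installed and that the data
follow its layout (one variable per row, one sample per column). The
matrices p1 and p2 are made-up data, and 0.02 is just an example
threshold.

```matlab
% Hypothetical data in the toolbox layout: variables in rows, samples in columns.
p1 = randn(5, 300);          % first batch: 5 variables x 300 samples
p2 = randn(5, 1);            % second batch: a single new sample

% Standardize the first batch; keep its row means and standard deviations.
[pn1, meanp, stdp] = prestd(p1);

% PCA on the standardized first batch; components contributing less than
% 2% of the total variance are dropped.
[ptrans1, transMat] = prepca(pn1, 0.02);

% Reuse the first batch's statistics and transformation matrix
% on the second batch instead of running PCA on it.
pn2     = trastd(p2, meanp, stdp);
ptrans2 = trapca(pn2, transMat);
```

The network would then be trained on ptrans1 and simulated on ptrans2.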

Also search the archives for threads containing

greg-heath prepca

Hope this helps.

Greg

From: Gian Piero Bandieramonte on 6 Sep 2006 17:26

> Read my post on pretraining advice. There I recommend
> PREPCA which outputs T and PC.
>
> help prepca
> help trapca
>
> Also search the archives for threads containing
>
> greg-heath prepca
>
> Hope this helps.
>
> Greg

I cannot find your pretraining advice anywhere on MATLAB Central.
Please point me to where I can find it.

I have been researching the use of prepca and trapca. They seem good,
but I have a doubt regarding their use:

I tested prepca with this 20x5 matrix:
v=[1 2 3 4 5;1 3 5 7 9;23 5 77 3 2; 3 5 7 35 456; 345 456 1234 568
34;234 523 235 123 34;1 234 63 346 23;234 234 234 234 234; 4 56 234
423 5;1 4 6 4 6; 23 6 234 63 2;34 345 3 346 74; 24 34 34 34 54; 34 23
2 4 5; 23 3 3 24 32; 2314 234 234 234 234; 2 4 52 14 52; 24 42 5 3
63; 2314 52 52 253 5; 213 4 12 42 52]

having

v =

     1     2     3     4     5
     1     3     5     7     9
    23     5    77     3     2
     3     5     7    35   456
   345   456  1234   568    34
   234   523   235   123    34
     1   234    63   346    23
   234   234   234   234   234
     4    56   234   423     5
     1     4     6     4     6
    23     6   234    63     2
    34   345     3   346    74
    24    34    34    34    54
    34    23     2     4     5
    23     3     3    24    32
  2314   234   234   234   234
     2     4    52    14    52
    24    42     5     3    63
  2314    52    52   253     5
   213     4    12    42    52

So I first apply

pn=prestd(v);

then

[ptrans,transMat] = prepca(pn',0.02);

Note that I apply prepca to the transpose of pn, because MATLAB
returns an error if the matrix has more rows than columns. In this
case ptrans is a 1x20 vector.
But if I instead do

pn=prestd(v');
[ptrans,transMat] = prepca(pn,0.02);

where I transpose the matrix before applying prestd and then prepca,
ptrans is a matrix of 5 rows and 20 columns. The results are
different, so when am I supposed to transpose?
Whichever case is correct, I should then transpose the result back so
that I have the original row layout but with fewer columns, meaning
the dimensionality has been reduced. Is this correct?
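
For reference, a sketch of the transpose-first ordering using only base
MATLAB functions. It assumes, as the toolbox help text appears to
describe, that the toolbox routines treat each row as a variable and
each column as a sample. The data, the choice k = 2 and the use of svd
are illustrative stand-ins, not the internals of prestd or prepca.

```matlab
% Hypothetical data: 20 observations (rows) of 5 variables (columns).
v = rand(20, 5) * 100;
n = size(v, 1);

% Transpose FIRST so that variables are in rows and samples in columns.
p = v';                                          % 5 x 20

% Row-wise standardization: zero mean, unit std for each variable.
m  = mean(p, 2);
s  = std(p, 0, 2);
pn = (p - repmat(m, 1, n)) ./ repmat(s, 1, n);

% PCA via SVD: columns of U are the principal directions.
[U, S, W] = svd(pn);
k = 2;                                           % hypothetical number of components kept
T = U(:, 1:k)';                                  % k x 5 transformation matrix

% Transform, then transpose back: 20 rows, fewer columns.
reduced = (T * pn)';                             % 20 x k
```

Standardizing before transposing would instead normalize each
observation across its 5 variables, which is a different operation and
explains why the two orderings give different results.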

From: Greg Heath on 6 Sep 2006 23:21

Gian Piero Bandieramonte wrote:
> > Read my post on pretraining advice. There I recommend
> > PREPCA which outputs T and PC.
> >
> > help prepca
> > help trapca
> >
> > Also search the archives for threads containing
> >
> > greg-heath prepca
> >
> > Hope this helps.
> >
> > Greg
>
> I cannot find your pretraining advice anywhere on MATLAB Central.
> Please point me to where I can find it.

1. Go to Google Groups and search on

greg-heath pretraining-advice

2. Sort by date
3. It should be near the last post

Hope this helps.

Greg
------------SNIP

From: Gian Piero Bandieramonte on 7 Sep 2006 11:23

> 1. Go to Google Groups and search on
>
> greg-heath pretraining-advice
>
> 2. Sort by date
> 3. It should be near the last post
>
> Hope this helps.
>
> Greg
> ------------SNIP

I have read your post on pretraining advice, and the closest thing to
an answer to my question is point 2 of your advice:

"2. Use TRANSPOSE and PRESTD to standardize the columns of Z. On
special occasions normalization to the bounded interval [-1,1]
(PREMNMX) is used for some columns. However, this is most
useful only if you know that all unknown data must fall
within the original bounds of the training data. "

It says to use transpose and prestd, but it does not say anything
about the order in which to use them. So, should I transpose first or
apply prestd first? Sorry if I'm being tedious; I want to be sure of
this before I change my code from using princomp to using prepca.

Thanks...
