From: Adham on
I am doing classification using SVD+NN for two classes.
The original data has 16 channels, 2500 samples and 280 epochs, but since the results were so poor I thought the problem might be with the data, so I created a random dataset D with target T. The result is still no good. I would appreciate it if somebody could give me some advice on this.

%%
% producing a random dataset of 4 channels, 100 samples, 200 epochs
D=rand(4,100,200);
T=round(rand(2,200));// target, 2 classes
%%
[i2,j2,z2]=size(D);
for i=1:i2
for j=1:j2
P(i*j,:)=D(i,j,:);
end
end

%%

[U,S,V] = svd(P,'econ');
test1=U'*P;
test2=S*V';
%class responces
%%
P=double(test1);
T=double(T);
%%
[tridx,testidx,validx] = srs(T,[0.8 0.1 0.1]);
% the srs function provides a set of indexes for training, testing and
% validation data sets based on the percentages that are passed through it
% (here 80% training, 10% testing and 10% validation)
Val.P=P(:,validx);
Val.T=T(:,validx);

Tr.P=P(:,tridx);
Tr.T=T(:,tridx);
Ts.P=P(:,testidx);
Ts.T=T(:,testidx);
%%
net = newff(Tr.P,Tr.T,40,{'tansig' 'purelin'},'traingdx');
net.trainParam.show = inf;
net.trainParam.goal = 1e-5;
net.trainParam.lr = 0.01;
net.trainparam.lr_inc = 1.05;
net.trainparam.lr_dec = 0.7;
net.trainparam.max_perfect_inc = 1.04;
net.trainparam.mc = 0.9;
net.trainparam.min_grad = 1e-10;
net.trainParam.epochs = z2;

[net,tr] = train(net,Tr.P,Tr.T, [],[],Val);
y = sim(net,Ts.P);
%%
% CM is the contagency matrix
cm = faull(compet(y)*compet(Ts.T)')
acc = sum(diag(cm)) / sum(cm(:));
From: Greg Heath on
On Apr 12, 10:36 pm, "Adham " <atya0...(a)flinders.edu.au> wrote:
> I am doing classification using SVD+NN for two classes.

1. Why are you using SVD? I have never seen this done before.

> the original data has 16 channel, 2500 samples and 280 epochs

2. Incorrect Terminology:

You have 1 data sample containing N observations (or cases)
of I = 16 input variables.

I can't tell what N is because I don't know what you mean by
using the term epoch to describe data.

Epoch is an algorithm parameter; not a data descriptor:
For a batch learning algorithm all of the training
data passes through the algorithm each epoch.


> but since the result was so poor, I thought maybe the problem
> is with the data so I created a random dataset D with Target T.
> The result is still no good. I would appreciate it if somebody
> could give me some advise on this.

3. It might help to read my posts on pretraining advice. Search
keywords

greg heath pre training advice for newbies
greg heath Neq Nw

4. NN Toolbox Data Convention

[I N] = size(P) % I = number of input variables,
% N = number of cases
[O N] = size(T) % O = number of output variables

> % producing a random dataset of 4 channels, 100 samples,
> 200 epochs
> D=rand(4,100,200);
> T=round(rand(2,200));// target, 2 classes

5. Syntax: Replace "//" with "%"

O = 2
N = 200

6. Error: Only need one output for 2 classes. If you
use 2 only allow [1 0]' and [0 1]'. [1 1]' and [0 0]'
are not allowed.
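
A minimal sketch of point 6, assuming the class memberships are available as indices 1 or 2 (the variable name labels is hypothetical, not from the original post). It builds valid targets in both codings and shows how often round(rand(2,N)) produces a disallowed column:

```matlab
% Sketch only: build valid two-class targets from class indices.
% 'labels' is a hypothetical 1-by-N vector of class indices (1 or 2).
N = 200;
labels = 1 + (rand(1, N) > 0.5);          % random class index, 1 or 2

% Two-output coding: every column is [1 0]' or [0 1]'.
T2 = zeros(2, N);
T2(sub2ind(size(T2), labels, 1:N)) = 1;

% Single-output coding: 1 for class 1, 0 for class 2.
T1 = double(labels == 1);

% round(rand(2,N)) makes all four column patterns equally likely,
% so roughly half of the columns are [1 1]' or [0 0]'.
Tbad = round(rand(2, N));
fracInvalid = mean(sum(Tbad, 1) ~= 1);
```

With either coding each case belongs to exactly one class, which is what the classifier needs.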

> %%
> [i2,j2,z2]=size(D);
> for i=1:i2
> for j=1:j2
> P(i*j,:)=D(i,j,:);
> end
> end

I = 400
N = 200

7. I = 400 makes no sense to me. Please explain what
you mean by channels, samples and epochs. There is
a huge misunderstanding somewhere.

8. This appears to be a case of GIGO since there
is no deterministic relationship between P and T.
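
Regarding the I = 400 in point 7, one thing that may be worth checking in the quoted loop: the row index i*j is not unique, so several (channel, sample) pairs write to the same row and other rows are never written and stay zero. A sketch of a flattening that gives each pair its own row, assuming one column per trial is the intended layout:

```matlab
% Sketch only: flatten a channels-by-samples-by-trials array into the
% [I N] layout, one column per trial.
D = rand(4, 100, 200);
[i2, j2, z2] = size(D);

% reshape puts element (i, j, k) in row (j-1)*i2 + i of column k.
P = reshape(D, i2*j2, z2);                % 400-by-200

% The quoted loop uses row i*j instead, which collides
% (for example 1*4, 2*2 and 4*1 all give row 4).
[ii, jj] = ndgrid(1:i2, 1:j2);
nDistinctRows = numel(unique(ii(:) .* jj(:)));   % fewer than i2*j2
```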

> %%
>
> [U,S,V] = svd(P,'econ');

9. Incorrect syntax

doc svd
help svd

> test1=U'*P;
> test2=S*V';
> %class responces
> %%
> P=double(test1);
> T=double(T);

10. "double" is unnecessary

> %%
> [tridx,testidx,validx] = srs(T,[0.8 0.1 0.1]);
> % the srs function provides a set of indexes for training,
> % testing and validation data sets based on the percentages
> % that are passed trough it (in here, 80% training,
> % 10%testing, and 10% validating)

> Val.P=P(:,validx);
> Val.T=T(:,validx);
>
> Tr.P=P(:,tridx);
> Tr.T=T(:,tridx);
> Ts.P=P(:,testidx);
> Ts.T=T(:,testidx);
> %%
> net = newff(Tr.P,Tr.T,40,{'tansig' 'purelin'},'traingdx');

11. How do you justify using H = 40 hidden nodes?
See my above posts with keywords Neq Nw.

12. If your outputs are zeros and ones, why aren't you
using LOGSIG for the output activation?

13. Why aren't you using the default training algorithm
TRAINLM?
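
A rough sketch of the counting argument behind point 11. It assumes that Neq means the number of training equations (training cases times outputs) and Nw the number of unknown weights and biases of a single-hidden-layer net; those definitions are an assumption here, since this thread only gives the search keywords.

```matlab
% Sketch only: compare training equations with unknown weights.
I    = 400;                  % inputs produced by the quoted loop
O    = 2;                    % outputs
H    = 40;                   % hidden nodes in the quoted newff call
Ntrn = round(0.8 * 200);     % training cases with the 0.8 split

Neq = Ntrn * O;                      % number of training equations
Nw  = (I + 1) * H + (H + 1) * O;     % weights and biases of an I-H-O net

fprintf('Neq = %d, Nw = %d, Nw/Neq = %.1f\n', Neq, Nw, Nw / Neq);
```

With these numbers Nw (16122) is far larger than Neq (320), so the net has many more free parameters than training equations.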

> net.trainParam.show = inf;
> net.trainParam.goal = 1e-5;
> net.trainParam.lr = 0.01;
> net.trainparam.lr_inc = 1.05;
> net.trainparam.lr_dec = 0.7;
> net.trainparam.max_perfect_inc = 1.04;
> net.trainparam.mc = 0.9;
> net.trainparam.min_grad = 1e-10;
> net.trainParam.epochs = z2;

14. Why aren't you using the default training parameters?
What is z2?

> [net,tr] = train(net,Tr.P,Tr.T, [],[],Val);
> y = sim(net,Ts.P);
> %%
> % CM is the contagency matrix

15. contingency?

> cm = faull(compet(y)*compet(Ts.T)')

16. Unknown function FAULL

> acc = sum(diag(cm)) / sum(cm(:));

Hope this helps.

Greg

From: Adham on

Dear Greg
Thanks for your response. I read your previous posts and they were very helpful. However, even with the simplest setup (net=newff(P,T), net=train(P,T)), using a 3*5 matrix for P and a 1*5 matrix for T, I could not reach better than 50% accuracy.
I have already tried different setups for the NN, including logsig and tansig for the input and hidden layers. The project is about EEG signal classification. The actual data comes from a structure containing a 1*280 cell array, and each cell holds a 118*2500 matrix.
280 is the number of epochs (in the brain-signal sense), 118 is the number of channels/electrodes and 2500 is the number of samples. In the following code this is stored in the d0 matrix (118*2500*280). The target T is a 2*280 matrix.
As for using a single vector as T for classifying between two classes, I have already tried that. However, the current target matrix T is guaranteed to contain no [1 1] or [0 0] columns. The other reason is that I need to build a contingency matrix, for which I need a matrix instead of a vector.

[i2,j2,z2]=size(d0); %i2=118, j2=2500, z2=280
P=reshape(d0,i2*j2,z2);

[I N]=size(P) % I=295000, N=280
%%
% svd is used as feature extractor and since the data size is huge, the 'econ' is used for memory management
[u,s,v]=svd(P,'econ');
test=u'*P;
P=double(test); % P is of type single and I can't pass it through the NN unless it is double
% [I N]=size(P) I=280 N=280
T(1,:) = ismember(group,'foot');
T(2,:) = ismember(group,'right');
T=double(T);% T is a 2*280 matrix containing 0 and 1
% [O N]=size(T) O=2 N=280
%%
[tridx,tsidx,vidx]=srs(T,[0.6 0.2 0.2]);
tr.P=P(:,tridx);%
tr.T=T(:,tridx);%
ts.P=P(:,tsidx);%
ts.T=T(:,tsidx);%
val.P=P(:,vidx);%
val.T=T(:,vidx);%
net=newff(tr.P,tr.T,40,{'tansig', 'logsig'},'trainscg');
%%
% I use trainscg since trainlm needs so much memory that my PC cannot manage it
% the number of neurons is based on previous reports with the same data; they reached 90% accuracy using 40 neurons, tansig, logsig, trainlm, learngdm and msereg
net=init(net);
[net,trainrec]=train(net,tr.P,tr.T,[],[],val);

%%
y=sim(net,ts.P);
cm=full(compet(y)*compet(ts.T)')
acc=sum(diag(cm))/sum(cm(:))
From: Greg Heath on
On Apr 13, 11:09 pm, "Adham " <atya0...(a)flinders.edu.au> wrote:
> Dear Greg
> Thanks for your respond. I read your previous posts and they were so great.
> However, even with the simplest setup (net=newff(P,T), net=train(P,T)

net=newff(P,T,H);
net=train(net,P,T);


>and using a 3*5 matrix for P and 1*5 for T I could not reach to a better accuracy
> than 50%.

What is it about the data that you think you should do better than
50%?
How many hidden nodes?


> I already tried different setups for NN including using logsig and tansig
> for input and hidden layers.

input???

Use tansig for hidden, logsig for output; transform inputs to have
zero mean.
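
A small sketch of the zero-mean step, on a toy matrix in the [I N] layout (the sizes are arbitrary). It removes the mean of each input variable across cases:

```matlab
% Sketch only: remove the mean of each input variable (row) across cases.
P  = 5 + rand(3, 5);                     % toy 3-by-5 input matrix
mu = mean(P, 2);                         % per-variable mean, I-by-1
Pz = P - repmat(mu, 1, size(P, 2));      % zero-mean inputs

% The same mu, computed on the training cases, would then be
% subtracted from the validation and test inputs.
```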

>the project is about EEG signal classification. the actual data is coming from a structur containing a cell 1*280 and each cell has 118*2500.
> 280 is the number of epochs (in brain signal),

Is "epoch" really a medical term used to characterize EEGs?
Even if it is, don't use it with neural networks where epoch
has a different meaning.

> 118 is the number of channels/electrods and 2500 is the number of samples.
> this is stored in the following code in the d0 matrix (118*2500*280).
> The target is T and it is a 2*280 matrix.
> About using a single vector as T for classifying between two classes, I
> already tried that. however, in the current target matrix (T), it is
> guaranteed that there are no [1 1] or [0 0]. The other reason for this is
> that I need to make contingency matrix for which I need a matrix instead
> of a vector

It can be done with either.

Is the term "contingency matrix" commonly used in EEG classification?
"Confusion matrix" is the corresponding NN terminology.
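
As an illustration of "it can be done with either", a sketch that builds the 2-by-2 confusion matrix from a single-output net. The targets and outputs are made-up toy values and the 0.5 threshold is an assumption:

```matlab
% Sketch only: confusion matrix from a single-output classifier.
t = [1 1 0 0 1 0];               % hypothetical targets (1 = class 1, 0 = class 2)
y = [0.9 0.4 0.2 0.6 0.8 0.1];   % hypothetical net outputs in [0, 1]

predIdx = 2 - (y >= 0.5);        % predicted class index, 1 or 2
trueIdx = 2 - t;                 % true class index, 1 or 2

% Rows are predicted classes, columns are true classes.
cm  = accumarray([predIdx(:) trueIdx(:)], 1, [2 2]);
acc = sum(diag(cm)) / sum(cm(:));
```

The row and column orientation matches compet(y)*compet(T)' in the quoted code.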

>
> [i2,j2,z2]=size(d0); %i2=118, j2=2500, z2=280
> P=reshape(d0,i2*j2,z2);
>
>  [I N]=size(P) I=295000 N=280
> %%
> % svd is used as feature extractor and since the data size is huge, the 'econ' is used for memory management
> [u,s,v]=svd(P,'econ');

'econ' is not a valid input with my MATLAB 6.5. I have to use
svd(P,0) to get the same effect as 'econ'.

For regression, PCA is typically used for input variable reduction.
However, the corresponding cov or corr matrix has dimensions [I I],
which, in this case, would cause problems. Therefore, I see why you
are using SVD.

However, I don't recommend PCA or SVD for variable reduction
with classifiers because they do not necessarily help. A better
method is PLS (partial least squares). See Wikipedia and its
references.

Also see (watch out for URL wrap-around)

http://www.mathkb.com/Uwe/Forum.aspx/statistics/4427/PCA-for-cluster-detection

http://groups.google.com/group/comp.ai.neural-nets/msg/e003d9d49b5454c7?hl=en
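
To make the SVD discussion concrete, a small sketch on toy data (the sizes and the choice k = 10 are arbitrary assumptions). When all N components are kept, u'*P equals s*v', which is only a rotation of the data. A reduction happens only when components are dropped, and the ordering by singular value does not use the class labels.

```matlab
% Sketch only: what u'*P does with an economy-size SVD.
P = rand(500, 40);                  % toy data: I = 500 variables, N = 40 cases
[U, S, V] = svd(P, 'econ');         % older releases use svd(P, 0) instead

Zfull  = U' * P;                    % N-by-N scores, all components kept
rotErr = norm(Zfull - S * V', 'fro') / norm(P, 'fro');   % close to zero

k  = 10;                            % arbitrary number of components to keep
Zk = U(:, 1:k)' * P;                % k-by-N reduced inputs
```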

> test=u'*P;
> P=double(test);% the type of P is single and I can't pass it trough NN unless it is double

Aha! ...the measurement data is single...OK

> % [I N]=size(P) I=280 N=280
> T(1,:) = ismember(group,'foot');
> T(2,:) = ismember(group,'right');

'foot' and 'right'? Please explain

> T=double(T);% T is a 2*280 matrix containing 0 and 1
> % [O N]=size(T) O=2 N=280
> %%
> [tridx,tsidx,vidx]=srs(T,[0.6 0.2 0.2]);
> tr.P=P(:,tridx);%
> tr.T=T(:,tridx);%
> ts.P=P(:,tsidx);%
> ts.T=T(:,tsidx);%
> val.P=P(:,vidx);%
> val.T=T(:,vidx);%
> net=newff(tr.P,tr.T,40,{'tansig', 'logsig'},'trainscg');
> %%
> %I use trainscg since trainlm needs so much memory that my pc can not manage it
> % the number of neurons is set based on previous reports with the same data. they reached to 90% accuracy using 40 neuron, tansig, logsig, trainlm, learngdm, and msereg
> net=init(net);
> [net,trainrec]=train(net,tr.P,tr.T,[],[],val);
>
>  %%
> y=sim(net,ts.P);
> cm=full(compet(y)*compet(ts.T)')
> acc=sum(diag(cm))/sum(cm(:))

Good Luck.

Greg

From: Adham on
Dear Greg

Thanks for your help. I was running some extra tests and realized that after performing SVD, the results are similar to those for random data sets. Later I also downsampled my data and, without any data decomposition or feature extraction, the results with a basic tansig, logsig, trainscg neural net reached 70%.

Considering that my data comes from a multidimensional matrix (actually a cell array) of 118x2500x280 and my target is 2x280, what do you suggest for feature extraction? I need to reduce the size of my input data (P) before sending it to the neural net, or the accuracy drops dramatically. What I am doing now is performing three steps of unfolding + SVD + refolding, in this way:
[i,j,k]=size(P);% i=118, j=2500,k=280
[d,dimsize,dimorder]=unfold(P,1,[2 3]); % d=118x700000
[u,s,v]=svd(d,'econ');
d=s*v';
P=refold(d,dimsize,dimorder);% 118x2500x280
[d,dimsize,dimorder]=unfold(P,2,[1 3]); % d=2500x33040
[u,s,v]=svd(d,'econ');
d=s*v';
P=refold(d,dimsize,dimorder);% 118x2500x280
[d,dimsize,dimorder]=unfold(P,3,[1 2]); % d=280x295000
[u,s,v]=svd(d,'econ');
d=s*v';
P=refold(d,dimsize,dimorder);% 118x2500x280
After this, I downsample the data by a factor of 2 in the second dimension, reshape it to a two-dimensional matrix, divide it into three subsets for training, testing and validation (using the SRS function that I explained before) and then pass it to the NN. However, the final P (before splitting it into three parts) is still huge, and it looks like the NN memorizes the data instead of adapting the weights (if that is a good way to describe it). I would appreciate any suggestions about the way I am preparing my data for the NN. I also need to know how to determine the best downsampling factor.
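
For reference, a sketch of the downsample-and-reshape step described above, on a small toy array standing in for the 118x2500x280 data (the toy sizes and the variable names P3 and f are assumptions). Keeping every f-th sample applies no low-pass filter, so aliasing is possible; a filtered alternative such as decimate from the Signal Processing Toolbox may be preferable if it is available.

```matlab
% Sketch only: downsample the sample dimension and flatten to [I N].
P3 = rand(6, 50, 8, 'single');      % toy channels x samples x trials
f  = 2;                             % downsampling factor

Pd = P3(:, 1:f:end, :);             % keep every f-th sample, no filtering
[nc, ns, nt] = size(Pd);
P  = double(reshape(Pd, nc*ns, nt));    % one column per trial, type double
```

One possible way to compare factors would be to repeat this for several values of f and look at the validation accuracy of each.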


Cheers
Adham

