From: Mancomb Seepgood on
Greg Heath <heath(a)alumni.brown.edu> wrote in message <999ba3ed-9bd8-49df-8e29-2811dfc2d238(a)x38g2000vbx.googlegroups.com>...
> On Apr 22, 8:23 am, "Mancomb Seepgood"
> <ikarigendokunREMOV...(a)yahoo.com> wrote:
> > > > > > net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},...
> > > > > > 'trainrp','learngdm');
> >
> > > > > Replace {'purelin','tansig'} with {'tansig' , 'purelin' }
> >
> > > > I tried, and I get very weird results. I have tried replacing it and then
> > > > changing the number of epochs and some other parameters, but I don't
> > > > get good results...
> >
> > > Nevertheless, that is my STRONG recommendation.
> >
> > these are some of the outputs for {'tansig','purelin'}.
> > Unless I use rand('seed',pi), the outputs keep changing...
>
> > http://img205.imageshack.us/img205/2751/ann3g.jpg
> > http://img690.imageshack.us/img690/6354/ann2a.jpg
> > http://img203.imageshack.us/img203/7791/ann1w.jpg
> >
> > how come the network is generated in a much better way when
> > I'm using {'purelin','tansig'}?
>
> Something is very wrong.
>
> First:
>
> 1. PURELIN is appropriate for any output. However,
> 2. TANSIG is more appropriate when physics or math
> restricts the output to finite bipolar ranges
> that can be linearly transformed to [-1,1]
> 3. LOGSIG is more appropriate when physics or math
> restricts the output to finite unipolar ranges
> that can be linearly transformed to [0,1]
>
> Now, you have used both PURELIN and TANSIG outputs;
> however, your plots show outputs that are restricted
> to [0,1].
>
> Please explain the reason for this.
I am sorry, I forgot to mention: I am scaling my outputs to [0,1] so that I can compare them, because they aren't in the same range.



>
> -----SNIP
> > SSE00 = 1.9309e+005
> > MSE00 = 198.2450
> -----SNIP
> > SSE0 = 1.9818e+004
> > MSE0 = 20.3894
> > Rsq0 = 0.8974
> > Rsqa0 = 0.8972
>
>
> OK. With a linear model already yielding R^2 ~ 0.9,
> it doesn't seem like you would need a huge number of
> hidden nodes to get close to 0.99.
>
> Can you duplicate the linear model results with H = 1?
>
For H=1 the results are:
Nw = 5
MSEgoal = 1.9723
SSEtrn = 1.6535e+004
MSEtrn = 17.0466
Rsqtrn = 0.9144
Rsqatrn = 0.9140


> Make a reasonable search for an "optimal" H:
> Loop over H = [1: Hmax] with, say
>
> Hmax = floor((Neq/10-O)/(I+O+1)) = 24
>
> 5 or 10 trials for each H is not unreasonable.
>
>
> > Nw = 81
> > MSEgoal = 1.8178
> > SSEtrn = 1.2475e+004
> > MSEtrn = 13.9543
> > Rsqtrn = 0.9354
> > Rsqatrn = 0.9296
>
> >
Done, and this is what I got for the optimal H = 13:
MSEgoal = 1.8747
SSEtrn = 1.2412e+004
MSEtrn = 13.4620
Rsqtrn = 0.9357
Rsqatrn = 0.9321
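The H-search Greg describes (loop over candidate hidden-layer sizes with several random-initialization trials each) has this loop structure. The actual training happens in MATLAB, so the sketch below uses a Python stand-in with a placeholder scoring function; `train_and_score` and its pretend MSE are made up for illustration:

```python
import random

def train_and_score(H, trial):
    """Placeholder for the MATLAB newff/train/sim + MSEtrn step.
    Returns a made-up MSE so that the search loop itself is runnable."""
    random.seed(H * 1000 + trial)
    return 20.0 / H + random.random()   # pretend MSE that tends to improve with H

Neq, I, O = 975, 2, 1
Hmax = (Neq // 10 - O) // (I + O + 1)   # floor((Neq/10 - O)/(I+O+1)) = 24
ntrials = 10                            # 5-10 random restarts per candidate H

# keep the best (MSE, H) pair over all candidates and trials
best_mse, best_H = min(
    (train_and_score(H, t), H)
    for H in range(1, Hmax + 1)
    for t in range(ntrials)
)
print("Hmax =", Hmax, "best H =", best_H)
```

In practice the inner call would be a full newff/train/sim cycle with `rand('seed', ...)` varied per trial, and the winner would be re-evaluated on validation data.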

> > Notice that MSEgoal is 1.8178 and in the training performance
> > plot I get this:
> > best validation performance is 14.73 (which means mse = 14.73)
> > at epoch 11.
>
> Explain what you mean by validation performance.
> You used all of your data for training.
>
> What do you mean by epoch 11? If things are going
> right, the mse should be asymptotically declining.
>
By validation performance I mean the button in nntraintool that plots performance (plotperform). You can see what it is if you type in MATLAB:
doc plotperform
or, in my case:
http://img293.imageshack.us/img293/2461/performance.jpg
So, after epoch 11 the network stopped learning. After that it makes (by default) 6 validation checks, and when it reaches the 6th, training stops and the network gives us the best output.

> > > > > If you mean R^2 = 0.95, why are you dissatisfied?
> >
> > > > I am not satisfied because I thought mse should be
> > > > even lower. I am satisfied with the regression being 0.95.
>
> The statements are contradictory.
>
> Given MSE00, there is a one-to-one relationship between
> MSE and R^2.
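Greg's remark can be made concrete: given SSE00 (equivalently MSE00) from the constant model, the DOF-adjusted MSE and R^2 determine each other through Rsq = 1 - SSE/SSE00 and MSE = SSE/(Ntrn*O - Nw). A quick Python stand-in, using the figures from the posted output:

```python
# Numbers taken from the MATLAB output posted in this thread
Ntrn, O, Nw = 975, 1, 81
SSE00 = 1.9309e5                    # constant-model sum-squared error
MSEtrn = 13.9543                    # DOF-adjusted training MSE

SSEtrn = MSEtrn * (Ntrn * O - Nw)   # invert the DOF adjustment
Rsqtrn = 1 - SSEtrn / SSE00         # R^2 follows uniquely from MSE
print(round(Rsqtrn, 4))             # recovers the reported R^2 of ~0.9354
```

So being "satisfied with R^2 = 0.95 but not with the MSE" is indeed contradictory: fixing one fixes the other.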
>
> > > Terminology: R-squared being 0.95
> >
> > Thanks, that's what I meant.
> >
> > > > Anyway, this is what I get as a comparison between the output of my network and the output that I was kind of supposed to get (mufr fuzzy is the output from the fuzzy system that I am using as a kind of validation of my net's output). http://img245.imageshack.us/i/annfuzzy.jpg/
> > > > What is your opinion on this?
> >
> > > I'm not satisfied either. If the fuzzy output is correct,
> > > then you are doing something wrong with the NN training.
> >
> > > Are you sure fuzzy is correct?
> >
> If your output matches your target but
> doesn't match some external comparison
> standard, how can the net be at fault?
>
> Overlay the plot of target and fuzzy.
> If they don't match, why should output
> and fuzzy match?

One update: my mentor told me that the fuzzy output doesn't necessarily have to be correct, which means that, after all, the network might be giving good results. The thing is that my network is built on a mathematically calculated data set, while the fuzzy system is based only on logistics. I am sorry I said the fuzzy output was 100% correct; that was only my assumption.


> You have 975 cases. However, your
> abscissa only goes up to 250.
>
> Why?
>
That's because I am simulating the network with a [2x233] matrix, and the fuzzy output is also a [1x233] vector. I'm sorry, I forgot to mention that.

Thank you Greg, you have been more than helpful!
regards,
Mancomb Seepgood.
From: Mancomb Seepgood on

> > > > net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},'trainrp','learngdm');
> >
> > > Replace {'purelin','tansig'} with {'tansig' , 'purelin' }
> >
> > I tried, and I get very weird results. I have tried replacing it and then changing the number of epochs and some other parameters, but I don't get good results...
>
> Nevertheless, that is my STRONG recommendation.
>
these are some of the outputs for {'tansig','purelin'}.
Unless I use rand('seed',pi), the outputs keep changing...
http://img205.imageshack.us/img205/2751/ann3g.jpg
http://img690.imageshack.us/img690/6354/ann2a.jpg
http://img203.imageshack.us/img203/7791/ann1w.jpg

how come the network is generated in a much better way when I'm using {'purelin','tansig'}?


>
> [I Ntrn] = size(ptrn)% ptrn is the input matrix used for training
> [O Ntrn] = size(ttrn)% ttrn is the target matrix used for training
>
> For a Naive Constant Model ( constant output independent of ptrn)
>
> y00 = repmat(mean(ttrn,2),1,Ntrn); % Output
> Nw00 = O % No. of estimated "weights"
> e00 = ttrn-y00; % Error
> SSE00 = sse(e00) % Sum-Squared-Error
> MSE00 = SSE00/(Ntrn*O-Nw00) % Mean-Squared-Error (DOF adj)
> Rsq00 = 1-MSE00/MSE00 % R-squared stat (= 0 for constant model)
>
> The degree-of-freedom (DOF) adjustment of MSE is needed to mitigate
> the optimistic bias of using the same data for training and error
> estimation. The MATLAB function mse(e00) only yields the unadjusted
> value sse(e00)/(Ntrn*O).
>
> For a Linear Model
>
> W = ttrn/[ones(1,Ntrn); ptrn]; % Weight matrix
> sizeW = [O I+1]
> Nw0 = O*(I+1) % No. of estimated weights
>
> y0 = W * [ones(1,Ntrn) ; ptrn]; % Output
> e0 = ttrn-y0; % Error
> SSE0 = sse(e0) % Sum-Squared-Error
> MSE0 = SSE0/(Ntrn*O-Nw0) % Mean-Squared-Error (DOF adj)
> Rsq0 = 1 - SSE0/SSE00 % R-squared statistic
> Rsqa0 = 1 - MSE0/MSE00 % adjusted R-squared stat
> % = 1-(SSE0/(Ntrn-I-1))/(SSE00/(Ntrn-1))
>
> For an I-H-O MLP NN Model with default design settings
>
> net = newff(minmax(ptrn),[H O]); % I-H-O MLP creation
> Nw = (I+1)*H+(H+1)*O % No. of estimated weights
> MSEgoal = 0.01*((Ntrn-Nw)/(Ntrn-1))*(SSE00/Ntrn)% MSE training goal
> net.trainParam.goal = MSEgoal;
>
> net = train(net,ptrn,ttrn); % I-H-O MLP design
> ytrn = sim(net,ptrn); % Training Output
> etrn = ttrn-ytrn;
> SSEtrn = sse(etrn)
> MSEtrn = SSEtrn/(Ntrn*O-Nw)
> Rsqtrn = 1 - SSEtrn/SSE00
> Rsqatrn = 1 - MSEtrn/MSE00
> % = 1 - (SSEtrn/(Ntrn-Nw))/(SSE00/(Ntrn-1))
>
> Performance is data dependent. When the error is
> zero, Rsqatrn = 1. However, there is rarely a good
> reason for trying to exceed Rsqatrn = 0.99.
> That is the rationale for choosing the above
> value of MSEgoal
>
> For either a design validation set (used repeatedly to
> determine H) or a test set (only used once to estimate
> performance on all nondesign data)
>
> y = sim(net,p);
> e = t-y;
> SSE = sse(e)
> MSE = SSE/N
> Rsq = 1 - SSE/SSE00
>
> A DOF adjustment is not needed for nontraining data.
>
> Successful design characteristics
> 1. Either MSEgoal is achieved or MSEtrn is minimized
> 2. MSEval and MSEtst are satisfactory
> 3. Changing H will not improve performance.
>
ok. So, this is what I got when I used your code above:

I = 2
Ntrn = 975
O = 1
Nw00 = 1
SSE00 = 1.9309e+005
MSE00 = 198.2450
Rsq00 = 0
sizeW = 1 3
Nw0 = 3
SSE0 = 1.9818e+004
MSE0 = 20.3894
Rsq0 = 0.8974
Rsqa0 = 0.8972
Nw = 81
MSEgoal = 1.8178
SSEtrn = 1.2475e+004
MSEtrn = 13.9543
Rsqtrn = 0.9354
Rsqatrn = 0.9296

Notice that MSEgoal is 1.8178, and in the training performance plot I get this:
best validation performance is 14.73 (which means mse = 14.73) at epoch 11.


> > > If you mean R^2 = 0.95, why are you dissatisfied?
> >
> > I am not satisfied because I thought mse should be even lower. I am satisfied with
> > the regression being 0.95.
>
> Terminology: R-squared being 0.95
>
Thanks, that's what I meant.


> > Anyway, this is what I get as a comparison between the output of my network and the output that I was kind of supposed to get (mufr fuzzy is the output from the fuzzy system that I am using as a kind of validation of my net's output). http://img245.imageshack.us/i/annfuzzy.jpg/
> > What is your opinion on this?
>
> I'm not satisfied either. If the fuzzy output is correct,
> then you are doing something wrong with the NN training.
>
> Are you sure fuzzy is correct?

Yes. I didn't do the fuzzy part, though; I was only given the results so I could compare.

thanks.
From: Greg Heath on
On Apr 20, 10:01 am, "Mancomb Seepgood"
<ikarigendokunREMOV...(a)yahoo.com> wrote:
> > > numHiddenNeurons = 20;
>
> > You should probably have H = numHiddenNeurons as an input
> > parameter and may need to use trial and error to determine a
> > practical value for H.
>
> > To train an I-H-O MLP with Ntrn I/O observations it is desirable to
> > have
>
> > Neq(Number of training equations) >> Nw(Number of unknown weights)
> > You have
> > Nw = (I+1)*H+(H+1)*O = 60+21 = 81
> > Neq = Ntrn*O = 975*1 = 975
> > r = Neq/Nw > 12
>
> > which looks OK
>
> thanks for this info.
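For reference, the weight-count arithmetic Greg quotes above (I = 2, H = 20, O = 1, Ntrn = 975) can be verified with a quick stand-alone check; Python is used here just for the arithmetic:

```python
I, H, O, Ntrn = 2, 20, 1, 975      # inputs, hidden neurons, outputs, training cases
Nw = (I + 1) * H + (H + 1) * O     # input-to-hidden plus hidden-to-output weights, biases included
Neq = Ntrn * O                     # one training equation per target element
r = Neq / Nw                       # equations-per-weight ratio
print(Nw, Neq, round(r, 2))        # 81 weights, 975 equations, ratio just over 12
```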
>
> > > net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},'trainrp','learngdm');
>
> > Replace {'purelin','tansig'} with {'tansig' , 'purelin' }
>
> I tried, and I get very weird results. I have tried replacing it and then changing the number of epochs and some other parameters, but I don't get good results...

Nevertheless, that is my STRONG recommendation.

>
> > > That's pretty much it. After that I simulate the network using another set that I have for verification, but when I compare the results, I don't get what I need.
> > > What is the problem?
>
> > Order of your activation functions?
>
> > Have you designed a linear backslash model? What is the
> > resulting R^2, and how does it compare with your NN results?
>
> I don't think I understood what you asked here. I don't know
> what a linear backslash model is; sorry, I am kind of new to
> this.

Sorry. The Statistics Toolbox row/column convention for
data matrices is the transpose of the NN Toolbox one.
The STB solution for the linear model X*B = Y is B = X\Y via
the backslash operator. As seen below, the NNTB convention
results in using the slash operator instead.

help slash

[I Ntrn] = size(ptrn)% ptrn is the input matrix used for training
[O Ntrn] = size(ttrn)% ttrn is the target matrix used for training

For a Naive Constant Model ( constant output independent of ptrn)

y00 = repmat(mean(ttrn,2),1,Ntrn); % Output
Nw00 = O % No. of estimated "weights"
e00 = ttrn-y00; % Error
SSE00 = sse(e00) % Sum-Squared-Error
MSE00 = SSE00/(Ntrn*O-Nw00) % Mean-Squared-Error (DOF adj)
Rsq00 = 1-MSE00/MSE00 % R-squared stat (= 0 for constant model)

The degree-of-freedom (DOF) adjustment of MSE is needed to mitigate
the optimistic bias of using the same data for training and error
estimation. The MATLAB function mse(e00) only yields the unadjusted
value sse(e00)/(Ntrn*O).
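As a concrete illustration of the DOF adjustment (a NumPy stand-in for the MATLAB expressions above; the error values and weight count are made up for the demo):

```python
import numpy as np

# Hypothetical training errors for O = 1 output, Ntrn = 10 cases
e = np.array([[1.0, -2.0, 0.5, 1.5, -1.0, 0.0, 2.0, -0.5, 1.0, -1.5]])
O, Ntrn = e.shape
Nw = 3  # number of estimated weights (e.g. a linear model with I = 2 inputs)

SSE = np.sum(e**2)                 # sum-squared error, like sse(e)
MSE_unadj = SSE / (Ntrn * O)       # what MATLAB's mse(e) returns
MSE_adj = SSE / (Ntrn * O - Nw)    # DOF-adjusted: divide by equations minus weights

# The adjusted value is always larger, countering the optimistic bias
# of estimating error on the same data used for fitting.
print(MSE_unadj, MSE_adj)
```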

For a Linear Model

W = ttrn/[ones(1,Ntrn); ptrn]; % Weight matrix
sizeW = [O I+1]
Nw0 = O*(I+1) % No. of estimated weights

y0 = W * [ones(1,Ntrn) ; ptrn]; % Output
e0 = ttrn-y0; % Error
SSE0 = sse(e0) % Sum-Squared-Error
MSE0 = SSE0/(Ntrn*O-Nw0) % Mean-Squared-Error (DOF adj)
Rsq0 = 1 - SSE0/SSE00 % R-squared statistic
Rsqa0 = 1 - MSE0/MSE00 % adjusted R-squared stat
% = 1-(SSE0/(Ntrn-I-1))/(SSE00/(Ntrn-1))
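The slash-operator linear model above can be cross-checked outside MATLAB. In this NumPy sketch, np.linalg.lstsq on the transposed system plays the role of `/` applied to the augmented input matrix; the data and true weights are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
I, O, Ntrn = 2, 1, 100
ptrn = rng.standard_normal((I, Ntrn))              # inputs, NNTB convention: I x Ntrn
W_true = np.array([[0.5, 2.0, -1.0]])              # [bias, w1, w2], made up for the demo
X = np.vstack([np.ones((1, Ntrn)), ptrn])          # augmented input [ones(1,Ntrn); ptrn]
ttrn = W_true @ X                                  # noiseless targets

# MATLAB: W = ttrn / X  <=>  least-squares solution of W*X = ttrn
W = np.linalg.lstsq(X.T, ttrn.T, rcond=None)[0].T  # O x (I+1)

y0 = W @ X                                         # linear-model output
e0 = ttrn - y0                                     # error
SSE0 = np.sum(e0**2)
SSE00 = np.sum((ttrn - ttrn.mean(axis=1, keepdims=True))**2)
Rsq0 = 1 - SSE0 / SSE00                            # R-squared statistic
print(Rsq0)                                        # ~1 for noiseless synthetic data
```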

For an I-H-O MLP NN Model with default design settings

net = newff(minmax(ptrn),[H O]); % I-H-O MLP creation
Nw = (I+1)*H+(H+1)*O % No. of estimated weights
MSEgoal = 0.01*((Ntrn-Nw)/(Ntrn-1))*(SSE00/Ntrn)% MSE training goal
net.trainParam.goal = MSEgoal;

net = train(net,ptrn,ttrn); % I-H-O MLP design
ytrn = sim(net,ptrn); % Training Output
etrn = ttrn-ytrn;
SSEtrn = sse(etrn)
MSEtrn = SSEtrn/(Ntrn*O-Nw)
Rsqtrn = 1 - SSEtrn/SSE00
Rsqatrn = 1 - MSEtrn/MSE00
% = 1 - (SSEtrn/(Ntrn-Nw))/(SSE00/(Ntrn-1))

Performance is data dependent. When the error is
zero, Rsqatrn = 1. However, there is rarely a good
reason for trying to exceed Rsqatrn = 0.99.
That is the rationale for choosing the above
value of MSEgoal.
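Plugging the poster's numbers into the MSEgoal formula above (Ntrn = 975, Nw = 81, SSE00 = 1.9309e5, all taken from the output posted in this thread) reproduces the goal value to rounding:

```python
Ntrn, Nw = 975, 81
SSE00 = 1.9309e5                   # constant-model SSE from the posted output
# target 1% of the (DOF-scaled) constant-model error, i.e. Rsqatrn ~ 0.99 at the goal
MSEgoal = 0.01 * ((Ntrn - Nw) / (Ntrn - 1)) * (SSE00 / Ntrn)
print(round(MSEgoal, 3))           # ~1.818, matching the MSEgoal in the posted output
```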

For either a design validation set (used repeatedly to
determine H) or a test set (only used once to estimate
performance on all nondesign data)

y = sim(net,p);
e = t-y;
SSE = sse(e)
MSE = SSE/N % N = number of nondesign cases; no DOF adjustment
Rsq = 1 - SSE/SSE00

A DOF adjustment is not needed for nontraining data.
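A minimal runnable version of the nondesign-set evaluation above (NumPy stand-in with made-up test data; for self-containment the constant-model reference SSE00 is recomputed on the same cases):

```python
import numpy as np

# Hypothetical test-set targets and network outputs
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

e = t - y
SSE = np.sum(e**2)                     # sse(e)
MSE = SSE / e.size                     # plain mean: no DOF adjustment for nontraining data
SSE00 = np.sum((t - t.mean())**2)      # constant-model reference
Rsq = 1 - SSE / SSE00                  # R-squared on nondesign data
print(Rsq)
```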

Successful design characteristics
1. Either MSEgoal is achieved or MSEtrn is minimized
2. MSEval and MSEtst are satisfactory
3. Changing H will not improve performance.

> > > I tried the trainlm function, and I tried changing parameters, like turning off dividing (net.divideFcn = '';), changing functions, and changing the number of hidden neurons to 1, 10, 20, 30, 50, 100, 200, 300, ... and I still don't get good results. After training the network I get these results:
> > > best performance is at epoch 15, for example, and mse is 15 or so (it's around 15 whether dividing is turned off or on) and regression is 0.95.
> > > Is there something that i'm doing wrong?
>
> > If you mean R^2 = 0.95, why are you dissatisfied?
>
> I am not satisfied because I thought mse should be even lower. I am satisfied with
> the regression being 0.95.

Terminology: R-squared being 0.95

> Anyway, this is what I get as a comparison between the output of my network and the output that I was kind of supposed to get (mufr fuzzy is the output from the fuzzy system that I am using as a kind of validation of my net's output). http://img245.imageshack.us/i/annfuzzy.jpg/
> What is your opinion on this?

I'm not satisfied either. If the fuzzy output is correct,
then you are doing something wrong with the NN training.

Are you sure fuzzy is correct?

Hope this helps.

Greg

From: Mancomb Seepgood on
One more question. I still have problems with another similar network that I'm making. This time R^2 = 0.6 or so... plus, the output is awful. I am training the net the same way as the one above, but I can't seem to figure out why there is this big change: why is mse around 600 and R^2 around 0.6? Here are some images of the inputs, target, and output. They might give a clue about whether the data is wrong or not.
mdf and rms, which I combine into a [2x999] matrix:
http://img405.imageshack.us/img405/3056/mdf.jpg
http://img16.imageshack.us/img16/1392/rmsc.jpg
target:
http://img16.imageshack.us/img16/9229/targetq.jpg
and output:
http://img196.imageshack.us/img196/2432/outo.jpg

thanks in advance.
From: Greg Heath on
On Apr 26, 6:48 pm, "Mancomb Seepgood"
<ikarigendokunREMOV...(a)yahoo.com> wrote:
> One more question. I still have problems with another similar network that I'm making. This time R^2 = 0.6 or so... plus, the output is awful. I am training the net the same way as the one above, but I can't seem to figure out why there is this big change: why is mse around 600 and R^2 around 0.6? Here are some images of the inputs, target, and output. They might give a clue about whether the data is wrong or not.
> mdf and rms, which I combine into a [2x999] matrix:
> http://img405.imageshack.us/img405/3056/mdf.jpg
> http://img16.imageshack.us/img16/1392/rmsc.jpg
> target: http://img16.imageshack.us/img16/9229/targetq.jpg
> and output: http://img196.imageshack.us/img196/2432/outo.jpg
>
> thanks in advance.

Obviously an error in coding somewhere.
It looks like you should subtract your output from unity.

Greg