From: Mancomb Seepgood on
Hello people,
I am doing a project that involves building a neural network. I get the data sets for that network from another program that I use to generate 2 vectors: mdf, which is <1x195>, and rms, which is also <1x195>. I merge them into one input as inp1=[mdf;rms], so its dimensions are <2x195>. I also get a target vector fr that is <1x195>. To make the neural net work better, I made 5 of these sets.
Using this help
http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/neuron_5.html
I figured I should use the batch style for my network, so I made one input and target set out of all the inputs and targets by doing this:

inp=[mdf1 mdf2 mdf3 mdf4 mdf5; rms1 rms2 rms3 rms4 rms5];%dimensions are 2x975
tgt=[fr1 fr2 fr3 fr4 fr5]; %dimensions are 1x975

Then I put that data set through the neural net that I wrote in a separate m-file (which is quite simple):

function net = create_fit_net(inputs,targets)

numHiddenNeurons = 20;                % size of the single hidden layer
% create the fitting network with the chosen transfer, training and learning functions
net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},'trainrp','learngdm');
net = init(net);                      % re-initialize the weights
net = train(net,inputs,targets);      % batch training on the whole data set

That's pretty much it. After that I simulate the network using another set that I keep for verification, but when I compare the results, I don't get what I need.
What is the problem?
I tried the trainlm function and I tried changing parameters, like turning off the dividing with net.divideFcn=''; or changing the functions and the number of hidden neurons to 1, 10, 20, 30, 50, 100, 200, 300, ... and I still don't get good results. After training the network I get these results:
best performance is at epoch 15, for example, and the mse is around 15 (whether dividing is turned on or off) and the regression is 0.95.
Is there something I'm doing wrong?

One more question:
while browsing I've seen some networks written like
net=newff(minmax(input),[numHiddenNeurons,numOutputs]);
what is the difference between that and net=newff(inputs,targets,numHiddenNeurons)? When I tried to build my network like that it wouldn't work. It said something like "Warning: NEWFF used in an obsolete way."
That's it for now. If you need more info, please ask; I would be very grateful if someone would help me with this or just share an opinion.
Thank you,
Mancomb Seepgood.



From: Mancomb Seepgood on
anyone? :(
From: Mancomb Seepgood on
> > numHiddenNeurons = 20;
>
> You should probably have H = numHiddenNeurons as an input
> parameter and may need to use trial and error to determine a
> practical value for H.
>
> To train an I-H-O MLP with Ntrn I/O observations  it is desirable to
> have
>
> Neq(Number of training equations) >> Nw(Number of unknown weights)
> You have
> Nw = (I+1)*H+(H+1)*O = 60+21 = 81
> Neq = Ntrn*O = 975*1 = 975
> r = Neq/Nw > 12
>
> which looks OK

Thanks for this info.

> > net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},'trainrp','learngdm');
>
> Replace {'purelin','tansig'} with {'tansig' , 'purelin' }

I tried, and I get very weird results. I have tried replacing it and then changing the number of epochs and some other parameters, but I don't get good results...


> > Thats pretty much it. After that i simulate network using some other set that i have for verifications but when i compare the results, i don't get what i need.
> > What is the problem?
>
> Order of your activation functions?
>
> Have you designed a linear backslash model? What is the
> resulting R^2 and how does it compare with your NN results?
>

I don't think I understood what you are asking here. I don't know what a linear backslash model is, sorry, I am kind of new to this.

> > i tried with trainlm function and i tried changing parameters like turning off  deviding net.divideFcn=''; or like changing functions and changing number of hidden neurons to 1, 10, 20, 30, 50, 100, 200, 300,... and i still don't get good results. After training the network i get these results:
> > best performance is at epoch 15 for example and mse is 15 or so (its around 15 whether divide is turned off or on) and regression is 0.95.
> > Is there something that i'm doing wrong?
>
> If you mean R^2 = 0.95, why are you dissatisfied?
>
I am not satisfied because I thought the mse should be even lower. I am satisfied with the regression being 0.95. Anyway, this is what I get as a comparison between the output of my network and the output that I was kind of supposed to get (mufr fuzzy is the output from the fuzzy system that I am using as a kind of validation for the output from my net):
http://img245.imageshack.us/i/annfuzzy.jpg/
What is your opinion on this?

> > One more question:
> > while browsing i've seen some networks being written like
> > net=newff(minmax(input),[numHiddenNeurons,numOutputs]);
> > what is the difference between that and net=newff(inputs,targets,numHiddenNeurons)? When I tried to build my network like that it wouldn't work. It said something like "Warning: NEWFF used in an obsolete way."
>
> The former syntax was used in older versions.
>
Thank you, that's what I expected.

> > That's it for now. If you need more info please ask, i would be very grateful if someone would help me with this or just share an opinion.
>
>
> Hope this helps.
>
> Greg

Thanks a lot Greg.
From: Greg Heath on
On Apr 16, 11:02 am, "Mancomb Seepgood"
<ikarigendokunREMOV...(a)yahoo.com> wrote:
> Hello people,
> I am doing a project that involves making a neural network. I get data sets for that neural network from some other program that i use to generate 2 vectors: mdf that is <1x195> and rms that is also <1x195> which i merge into one input like inp1=[mdf;rms] so it's dimensions are  <2x195>. I also get target vector fr that is <1x195>. So, in order to make neural net work better i made 5 of these sets.
> Using this help http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/neuron_5.html
> i figured i should use batch style for my network and so i made one input and target set  made of all inputs and targets by doing this:
>
> inp=[mdf1 mdf2 mdf3 mdf4 mdf5; rms1 rms2 rms3 rms4 rms5];%dimensions are 2x975
> tgt=[fr1 fr2 fr3 fr4 fr5]; %dimensions are 1x975
>
> Then, i put that data set through my neural net that i wrote in seperate m file (which looks quite simple):
>
> function net = create_fit_net(inputs,targets)
>
> numHiddenNeurons = 20;

You should probably have H = numHiddenNeurons as an input
parameter and may need to use trial and error to determine a
practical value for H.

To train an I-H-O MLP with Ntrn I/O observations  it is desirable to
have

Neq(Number of training equations) >> Nw(Number of unknown weights)
You have
Nw = (I+1)*H+(H+1)*O = 60+21 = 81
Neq = Ntrn*O = 975*1 = 975
r = Neq/Nw > 12

which looks OK
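
As a quick sanity check, these counts can be computed straight from your training matrices (a minimal sketch; ptrn and ttrn stand for the input and target matrices you train on):

[I Ntrn] = size(ptrn)     % I = 2 inputs, Ntrn = 975 cases
[O Ntrn] = size(ttrn)     % O = 1 output
H = 20;                   % hidden nodes
Nw = (I+1)*H + (H+1)*O    % 81 unknown weights
Neq = Ntrn*O              % 975 training equations
r = Neq/Nw                % ~12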

> net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},'trainrp','learngdm');

Replace {'purelin','tansig'} with {'tansig' , 'purelin' }
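
That is, keeping everything else in your call the same, something like:

net = newfit(inputs,targets,numHiddenNeurons,{'tansig','purelin'},'trainrp','learngdm');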

> net=init(net);

Unnecessary. newfit automatically initializes with random weights.

> net=train(net,inputs,targets);
>
> Thats pretty much it. After that i simulate network using some other set that i have for verifications but when i compare the results, i don't get what i need.
> What is the problem?

Order of your activation functions?

Have you designed a linear backslash model? What is the
resulting R^2 and how does it compare with your NN results?

> i tried with trainlm function and i tried changing parameters like turning off  deviding net.divideFcn=''; or like changing functions and changing number of hidden neurons to 1, 10, 20, 30, 50, 100, 200, 300,... and i still don't get good results. After training the network i get these results:
> best performance is at epoch 15 for example and mse is 15 or so (its around 15 whether divide is turned off or on) and regression is 0.95.
> Is there something that i'm doing wrong?

If you mean R^2 = 0.95, why are you dissatisfied?

> One more question:
> while browsing i've seen some networks being written like
> net=newff(minmax(input),[numHiddenNeurons,numOutputs]);
> what is the difference between that and net=newff(inputs,targets,numHiddenNeurons)? When I tried to build my network like that it wouldn't work. It said something like "Warning: NEWFF used in an obsolete way."

The former syntax was used in older versions.
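
Roughly, the two conventions look like this (a sketch; the newer call reads the input ranges and the output layer size directly from the data):

% obsolete style: pass the input ranges and every layer size explicitly
net = newff(minmax(inp),[numHiddenNeurons 1],{'tansig','purelin'});
% current style: pass the input and target data plus the hidden layer size
net = newff(inp,tgt,numHiddenNeurons);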

> That's it for now. If you need more info please ask, i would be very grateful if someone would help me with this or just share an opinion.


Hope this helps.

Greg
From: Greg Heath on
On Apr 22, 8:23 am, "Mancomb Seepgood"
<ikarigendokunREMOV...(a)yahoo.com> wrote:
> > > > > net = newfit(inputs,targets,numHiddenNeurons,{'purelin','tansig'},...
> > > > > 'trainrp','learngdm');
>
> > > > Replace {'purelin','tansig'} with {'tansig' , 'purelin' }
>
> > > I tried, and I get very weird results. I have tried replacing it and then
> > > trying to change the number of epochs or some other parameters but I don't
> > > get good results...
>
> > Nevertheless, that is my STRONG recommendation.
>
> these are some of the outputs for {'tansig','purelin'}.
> Unless i use rand('seed',pi), outputs keep hanging...

http://img205.imageshack.us/img205/2751/ann3g.jpg
http://img690.imageshack.us/img690/6354/ann2a.jpg
http://img203.imageshack.us/img203/7791/ann1w.jpg
>
> how come the network comes out much better when
> I'm using {'purelin','tansig'}?

Something is very wrong.

First:

1. PURELIN is appropriate for any output. However,
2. TANSIG is more appropriate when physics or math
restricts the output to finite bipolar ranges
that can be linearly transformed to [-1,1]
3. LOGSIG is more appropriate when physics or math
restricts the output to finite unipolar ranges
that can be linearly transformed to [0,1]

Now, you have used both PURELIN and TANSIG outputs;
however, your plots show outputs that are restricted
to [0,1].

Please explain the reason for this.

What is minmax(t)?
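
To check that quickly, and to rescale if the targets really are confined to [0,1], something along these lines would do (a sketch; t is your target matrix):

minmax(t)        % [min max] of each target row
tn = 2*t - 1;    % linear map from [0,1] to [-1,1], suitable for a TANSIG output
% train on tn, then map the simulated output back with y = (yn + 1)/2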

Next: a hidden-node PURELIN is just a matrix
multiplication that can be absorbed into the
output activation weights. Therefore, it is
never recommended for use in hidden layers.

If you get good results with a PURELIN hidden
layer, the same results can be achieved with
no hidden layer.
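
To see why: with PURELIN hidden units (and a linear output) the net computes y = W2*(W1*x + b1) + b2 = (W2*W1)*x + (W2*b1 + b2), i.e. just another linear model with weight matrix W2*W1 and bias W2*b1 + b2, which is exactly what the backslash solution already gives you.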

> > [I Ntrn] = size(ptrn)% ptrn is the input matrix used for training
> > [O Ntrn] = size(ttrn)% ttrn is the target matrix used for training
>
> > For a Naive Constant Model ( constant output independent of ptrn)
>
> > y00 = repmat(mean(ttrn,2),1,Ntrn); % Output
> > Nw00 = O % No. of estimated "weights"
> > e00 = ttrn-y00; % Error
> > SSE00 = sse(e00) % Sum-Squared-Error
> > MSE00 = SSE00/(Ntrn*O-Nw00) % Mean-Squared-Error (DOF adj)
> > Rsq00 = 1-MSE00/MSE00 % R-squared stat (= 0 for constant model)
>
> > The degree-of-freedom (DOF) adjustment of MSE is needed to mitigate
> > the optimistic bias of using the same data for training and error
> > estimation. The MATLAB function mse(e00) only yields the unadjusted
> > value sse(e00)/(Ntrn*O).
>
> > For a Linear Model
>
> > W = ttrn/[ones(1,Ntrn); ptrn]; % Weight matrix
> > sizeW = [O I+1]
> > Nw0 = O*(I+1) % No. of estimated weights
>
> > y0 = W * [ones(1,Ntrn) ; ptrn]; % Output
> > e0 = ttrn-y0; % Error
> > SSE0 = sse(e0) % Sum-Squared-Error
> > MSE0 = SSE0/(Ntrn*O-Nw0) % Mean-Squared-Error (DOF adj)
> > Rsq0 = 1 - SSE0/SSE00 % R-squared statistic
> > Rsqa0 = 1 - MSE0/MSE00 % adjusted R-squared stat
> > = 1-(SSE0/(Ntrn-I-1))/(SSE00/(Ntrn-1))
>
> > For an I-H-O MLP NN Model with default design settings
>
> > net = newff(minmax(ptrn),[H O]); % I-H-O MLP creation
> > Nw = (I+1)*H+(H+1)*O % No. of estimated weights
> > MSEgoal = 0.01*((Ntrn-Nw)/(Ntrn-1))*(SSE00/Ntrn)% MSE training goal
> > net.trainParam.goal = MSEgoal;
>
> > net = train(net,ptrn,ttrn); % I-H-O MLP design
> > ytrn = sim(net,ptrn); % Training Output
> > etrn = ttrn-ytrn;
> > SSEtrn = sse(etrn)
> > MSEtrn = SSEtrn/(Ntrn*O-Nw)
> > Rsqtrn = 1 - SSEtrn/SSE00
> > Rsqatrn = 1 - MSEtrn/MSE00
> > = 1 - (SSEtrn/(Ntrn-Nw))/(SSE00/(Ntrn-1))
>
> > Performance is data dependent. When the error is
> > zero, Rsqatrn = 1. However, there is rarely a good
> > reason for trying to exceed Rsqatrn = 0.99.
> > That is the rationale for choosing the above
> > value of MSEgoal
>
> > For either a design validation set (used repeatedly to
> > determine H) or a test set (only used once to estimate
> > performance on all nondesign data)
>
> > y = sim(net,p);
> > e = t-y;
> > SSE = sse(e)
> > MSE = SSE/N
> > Rsq = 1 - SSE/SSE00
>
> > A DOF adjustment is not needed for nontraining data.
>
> > Successful design characteristics
> > 1. Either MSEgoal is achieved or MSEtrn is mimimized
> > 2. MSEval and MSEtst are satisfactory
> > 3. Changing H will not improve performance.
>
> ok. So, this is what I got when I used your code above:

-----SNIP
> SSE00 = 1.9309e+005
> MSE00 = 198.2450
-----SNIP
> SSE0 = 1.9818e+004
> MSE0 = 20.3894
> Rsq0 = 0.8974
> Rsqa0 = 0.8972


OK. With a linear model already yielding R^2 ~ 0.9,
it doesn't seem like you would need a huge number of
hidden nodes to get close to 0.99.

Can you duplicate the linear model results with H = 1?

Make a reasonable search for an "optimal" H:
Loop over H = [1: Hmax] with, say

Hmax = floor((Neq/10-O)/(I+O+1)) = 24

5 or 10 trials for each H is not unreasonable.
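
A sketch of such a search, reusing the quantities defined above (ptrn, ttrn, SSE00, I, O, Neq):

Hmax = floor((Neq/10 - O)/(I+O+1));  % 24 here
Ntrials = 5;
Rsqbest = -Inf;
for H = 1:Hmax
   for trial = 1:Ntrials
      net = newff(minmax(ptrn),[H O]);  % fresh random initial weights each trial
      net = train(net,ptrn,ttrn);
      y = sim(net,ptrn);
      Rsq = 1 - sse(ttrn-y)/SSE00;
      if Rsq > Rsqbest
         Rsqbest = Rsq; Hbest = H; netbest = net;
      end
   end
end
[Hbest Rsqbest]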


> Nw = 81
> MSEgoal = 1.8178
> SSEtrn = 1.2475e+004
> MSEtrn = 13.9543
> Rsqtrn = 0.9354
> Rsqatrn = 0.9296

>
> Notice that MSEgoal is 1.8178 and in the training performance
> I get this:
> best validation performance is 14.73 (which means mse=14.73)
> at epoch 11.

Explain what you mean by validation performance.
You used all of your data for training.

What do you mean by epoch 11? If things are going
right, the mse should be asymptotically declining.

> > > > If you mean R^2 = 0.95, why are you dissatisfied?
>
> > > I am not satisfied because i thought mse should be
> > > even lower. I am satisfied with regression being 0.95.

The statements are contradictory.

Given MSE00, there is a one-to-one relationship between
MSE and R^2.
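
For example, ignoring the DOF adjustment and using the MSE00 = 198.2450 you posted, an mse of about 15 corresponds to R^2 = 1 - 15/198.2450, roughly 0.92, so the two figures you quoted carry the same information.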

> > Terminology: R-squared being 0.95
>
> Thanks, that's what I meant.
>
> > > Anyway, this is what I get as a comparison between the output of my network and the output that I was kind of supposed to get (mufr fuzzy is the output from the fuzzy system that I am using as a kind of validation for the output from my net):
> > > http://img245.imageshack.us/i/annfuzzy.jpg/
> > > What is your opinion on this?
>
> > I'm not satisfied either. If the fuzzy output is correct,
> > then you are doing something wrong with the NN training.
>
> > Are you sure fuzzy is correct?
>
> yes. I didn't do the fuzzy part though, I was only given the results
> so I can compare.

If your output matches your target but
doesn't match some external comparison
standard, how can the net be at fault?

Overlay the plot of target and fuzzy.
If they don't match, why should output
and fuzzy match?

You have 975 cases. However, your
abscissa only goes up to 250.

Why?

Overlay target, output and fuzzy.
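
Something along these lines would do it (a sketch; t, y and fuzzy standing for your target, network output and fuzzy reference vectors):

plot(t,'k'), hold on
plot(y,'b')
plot(fuzzy,'r')
legend('target','net output','fuzzy'), hold off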

Hope this helps.

Greg