From: George Burdell on
I am new to the neural networks toolbox, and am confused with its usage. Please take a look at the following simple code, and the output I received:
-----------------------------------------------
>> p = [0 1 2 3 4 5 6 7 8];
>> t = [ 0 1 0 0 1 0 1 1 0 ];
>> net = newff(p,t,20, {'tansig' 'logsig'});
>> [net,tr,y1] = train(net,p,t);
TRAINLM-calcjx, Epoch 0/100, MSE 0.313186/0, Gradient 0.0821321/1e-010
TRAINLM-calcjx, Epoch 6/100, MSE 0.142905/0, Gradient 0.000191683/1e-010
TRAINLM, Validation stop.
>> t(tr.trainInd)
ans =
0 0 0 1 1 1 0
>> y1
y1 =
0.0001 0.0001 0.0000 0.9974 0.9958 0.9757 0.0001
>> y2 = sim(net,p)
y2 =
0.9483 0.6281 0.5996 0.5099 0.5625 0.8746 0.9630 0.6854 0.6178
>> y2(tr.trainInd)
0.9483 0.5996 0.5099 0.5625 0.9630 0.6854 0.6178
--------------------------------------------------
My questions are:
1. Aren't y1 and y2(tr.trainInd) supposed to be the same, or at least very similar? Why are they so different?
2. It seems to me that the neural network was properly trained, at least for the training data it selected from t and p. But why are the values of y2 so far off, when y2 comes from applying the trained network to that same training data?
3. I have also noticed that values of y2 never go below 0.5, even when using very large sizes for p and t (t is always either 0 or 1). Why does this happen? The "logsig" function maps numbers to the range (0,1). I must be doing something completely wrong here, but I have no idea what.
From: Daniel Blackmer on
"George Burdell" <gburdell1(a)gmail.com> wrote in message <hkskq0$afs$1(a)fred.mathworks.com>...
> I am new to the neural networks toolbox, and am confused with its usage. Please take a look at the following simple code, and the output I received:
> -----------------------------------------------
> >> p = [0 1 2 3 4 5 6 7 8];
> >> t = [ 0 1 0 0 1 0 1 1 0 ];
> >> net = newff(p,t,20, {'tansig' 'logsig'});
> >> [net,tr,y1] = train(net,p,t);
> TRAINLM-calcjx, Epoch 0/100, MSE 0.313186/0, Gradient 0.0821321/1e-010
> TRAINLM-calcjx, Epoch 6/100, MSE 0.142905/0, Gradient 0.000191683/1e-010
> TRAINLM, Validation stop.
> >> t(tr.trainInd)
> ans =
> 0 0 0 1 1 1 0
> >> y1
> y1 =
> 0.0001 0.0001 0.0000 0.9974 0.9958 0.9757 0.0001
> >> y2 = sim(net,p)
> y2 =
> 0.9483 0.6281 0.5996 0.5099 0.5625 0.8746 0.9630 0.6854 0.6178
> >> y2(tr.trainInd)
> 0.9483 0.5996 0.5099 0.5625 0.9630 0.6854 0.6178
> --------------------------------------------------
> My questions are:
> 1. Aren't y1 and y2(tr.trainInd) supposed to be the same, or at least very similar? Why are they so different?
> 2. It seems to me that the neural network was properly trained, at least for the training data it selected from t and p. But why are the values of y2 so far off, when y2 comes from applying the trained network to that same training data?
> 3. I have also noticed that values of y2 never go below 0.5, even when using very large sizes for p and t (t is always either 0 or 1). Why does this happen? The "logsig" function maps numbers to the range (0,1). I must be doing something completely wrong here, but I have no idea what.

Just to point out here: train is dividing your data into "training" and "validation" sets. Since you are looking for an exact result, you may want to just change your divideFcn to '' like this:
after running net = newff(...), do net.divideFcn = ''; or you can duplicate your target and input vectors and tell it to separate them in blocks instead of randomly.
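Something like this, reusing the p and t from your post (an untested sketch; the 20-node hidden layer is just what you already had):
-----------------------------------------------
>> net = newff(p,t,20,{'tansig' 'logsig'});
>> net.divideFcn = '';            % use every sample for training (no validation/test split)
>> [net,tr,y1] = train(net,p,t);
>> y2 = sim(net,p);               % with no split, y2 should track y1 across all of p
-----------------------------------------------
(If you would rather keep a split, net.divideFcn = 'divideblock' uses contiguous blocks instead of a random draw.)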

If you want your output values to be between 0 and 1, you should use -1 (instead of 0) in your target vector. Don't forget that logsig(0) = 0.5, so the network is evaluating itself correctly for what you have supplied it with. I got quite confused by this too!
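Since logsig(n) is just 1/(1 + exp(-n)), you can check that midpoint directly at the prompt:
-----------------------------------------------
>> logsig(0)     % a net input of 0 lands exactly in the middle of the (0,1) range
ans =
    0.5000
-----------------------------------------------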

Hope this helps!
From: George Burdell on
"Daniel Blackmer" <daniel.blackmer(a)remove.thisgmail.dotcom> wrote in message <hksm8b$cpe$1(a)fred.mathworks.com>...
>
> > If you want your output values to be between 0 and 1, you should use -1 (instead of 0) in your target vector. Don't forget that logsig(0) = 0.5, so the network is evaluating itself correctly for what you have supplied it with. I got quite confused by this too!
>

Thanks for your help. So the target vectors I provide to "train" must be values "before" the output transfer function (logsig)? Why doesn't it just let you supply the "final" values? And why specifically "-1"? Why not -2, -3, -100, etc.? logsig would evaluate to 0 for those values too.
From: Daniel Blackmer on
"George Burdell" <gburdell1(a)gmail.com> wrote in message <hksnp4$jum$1(a)fred.mathworks.com>...
> "Daniel Blackmer" <daniel.blackmer(a)remove.thisgmail.dotcom> wrote in message <hksm8b$cpe$1(a)fred.mathworks.com>...
> >
> > > If you want your output values to be between 0 and 1, you should use -1 (instead of 0) in your target vector. Don't forget that logsig(0) = 0.5, so the network is evaluating itself correctly for what you have supplied it with. I got quite confused by this too!
> >
>
> Thanks for your help. So the target vectors I provide to "train" must be values "before" the output transfer function (logsig)? Why doesn't it just let you supply the "final" values? And why specifically "-1"? Why not -2, -3, -100, etc.? logsig would evaluate to 0 for those values too.

Like I told you, I was a little befuddled by this when I first started as well. Since my field of research involves lots of binary outputs, I will honestly tell you that I just gave up on using logsig because it never gave me what I thought it should. And since most of the time I train with Bayesian regularization, I find that it doesn't even converge correctly.

Well, if you are curious about the -100, I can't explain what is actually going on there, but something else is happening that makes it not QUITE as simple as taking the logsig of the output. (For example, I tried logsig(-100), which of course should be essentially 0. However, if you make that one of your target values, that output will converge to -49.5.) I can't fully explain it to you; I have just learned to work within the framework of how it DOES work, with outputs that are meaningful to whatever task I am trying to accomplish.

It is strange, however, because if you look at the help under backpropagation, the diagram they show of the tansig/purelin network, and the equations explaining its behavior, do seem to indicate that the output is literally going to be logsig(n2), which is mathematically impossible if you are getting outputs of -49.5.

I will also point out that for this sort of 'function approximation' problem you are doing, you would probably want to stick to purelin outputs. I have found that for this simple sort of problem it's a much better way to go than the sigmoids, and it tends to yield better results. Your problem, for example, can be solved with a purelin output and 5 hidden nodes, no problem.
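Roughly what I mean is something like this (an untested sketch reusing your p and t; the 5 hidden tansig nodes are just my suggestion above):
-----------------------------------------------
>> net = newff(p,t,5,{'tansig' 'purelin'});   % linear output layer instead of logsig
>> net.divideFcn = '';                        % train on every sample, no random split
>> net = train(net,p,t);
>> y = sim(net,p)                             % outputs are no longer squashed toward 0.5
-----------------------------------------------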

Hope my rambling insights help!
From: George Burdell on
Thanks for your help!