From: Greg Heath on 20 Feb 2010 00:31

On Feb 9, 4:41 pm, "George Burdell" <gburde...(a)gmail.com> wrote:
> I am new to the neural network toolbox, and am confused with its usage.
> Please take a look at the following simple code, and the output I received:
> -----------------------------------------------
> >> p = [0 1 2 3 4 5 6 7 8];
> >> t = [ 0 1 0 0 1 0 1 1 0 ];
> >> net = newff(p,t,20, {'tansig' 'logsig'});
> >> [net,tr,y1] = train(net,p,t);
>
> TRAINLM-calcjx, Epoch 0/100, MSE 0.313186/0, Gradient 0.0821321/1e-010
> TRAINLM-calcjx, Epoch 6/100, MSE 0.142905/0, Gradient 0.000191683/1e-010
> TRAINLM, Validation stop.
>
> >> t(tr.trainInd)

What are tr.trainInd and p(tr.trainInd)? The training and validation sets
were randomly chosen, so you don't know what you trained on. How, then, can
you interpret the behavior of the trained net?

> ans =
>      0     0     0     1     1     1     0
>
> >> y1
>
> y1 =
>     0.0001    0.0001    0.0000    0.9974    0.9958    0.9757    0.0001
>
> >> y2 = sim(net,p)
>
> y2 =
>     0.9483    0.6281    0.5996    0.5099    0.5625    0.8746    0.9630    0.6854    0.6178
>
> >> y2(tr.trainInd)
>
>     0.9483    0.5996    0.5099    0.5625    0.9630    0.6854    0.6178
> --------------------------------------------------
> My questions are:
> 1. Aren't y1 and y2(tr.trainInd) supposed to be the same,
>    or at least very similar? Why are they so different?

The subset of p chosen for training was not large enough to accommodate the
choice of H = 20 hidden nodes. H = 20 grossly overfit the net and led to
overtraining, i.e., the net had so many weights that it could memorize the
training subset without being able to generalize to nontraining data.

NOTE: If you plot (p,t) you will see that the specified I/O doesn't seem to
represent the characteristics of a deterministic function. Therefore, this
seems to be a bad example. The only thing that can be accomplished here is
to use the large number of hidden nodes to memorize the training subset.
However, when the entire set is used as input, the net fails.

> 2. It seems to me that the neural network was properly trained,
>    at least for the training data it selected from t and p.

No. For proper training you need the training and validation sets to be
sufficiently large. See my posts on pretraining advice. Search the group
archives using

   "greg heath" pretraining advice for newbies
   "greg heath" partition
   "greg heath" Ntrn Nval Ntst

> But why are the values of y2 so off, when it comes from applying
> the training data on the neural network?

Not all of p was used for training. The net could not generalize to
nontraining data. The net automatically normalized the training inputs.

> 3. I have also noticed that values of y2 never go below 0.5, even when
>    using very large sizes for p and t (t is always either 0 or 1).
>    Why does this happen?

The "logsig" function maps numbers to the range (0,1).

> I must be doing something completely wrong here, but I have no idea what.

Check to see what automatic normalizations were done.

Hope this helps.

Greg
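To see concretely what Greg is recommending (controlling the data division
yourself so that tr.trainInd is known in advance, and using far fewer hidden
nodes), here is a minimal sketch, assuming the same 2010-era newff/train
interface used above; the split indices and H = 2 are chosen only for
illustration.

   % Fix the data division so you know exactly which samples are trained on,
   % and keep the number of hidden nodes small relative to the training set.
   p = 0:8;
   t = [0 1 0 0 1 0 1 1 0];

   H   = 2;                                  % far fewer weights than H = 20
   net = newff(p,t,H,{'tansig' 'logsig'});

   net.divideFcn            = 'divideind';   % explicit split instead of dividerand
   net.divideParam.trainInd = 1:6;           % illustrative choice, not a recommendation
   net.divideParam.valInd   = 7:8;
   net.divideParam.testInd  = 9;

   [net,tr] = train(net,p,t);

   y_trn = sim(net,p(tr.trainInd));          % outputs on the known training subset
   y_all = sim(net,p);                       % outputs on the full input set

With the division fixed, y_trn and y_all(tr.trainInd) refer to the same known
samples, so any remaining disagreement points at the net itself rather than at
an unknown random split.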
From: Greg Heath on 20 Feb 2010 00:38

On Feb 9, 5:32 pm, "George Burdell" <gburde...(a)gmail.com> wrote:
> "Daniel Blackmer" <daniel.black...(a)remove.thisgmail.dotcom> wrote in message
> <hksm8b$cp...(a)fred.mathworks.com>...
>
> > If you want your output values to be between 0 and 1 you should set your
> > target vector to -1.

WHAT???? ABSOLUTELY NOT!! ...... SOMETHING IS VERY WRONG HERE.

> > Don't forget that logsig(0) = .5. That is evaluating itself correctly
> > for what you have supplied it with. I got quite confused by this too!
>
> Thanks for your help. So the target vectors I provide to "train" must be
> values "before" the output transfer function (logsig)?

NOOOOOOO!

> Why doesn't it just let you supply the "final" values? And why specifically
> "-1"? Why not -2, -3, -100, etc.? logsig would evaluate to 0 for those
> values too.

I don't have the latest version of the NN Toolbox. Please consult MATLAB
regarding this. Something is obviously wrong.

Please post once you find the answer.

Hope this helps.

Greg
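A quick command-line check confirms Greg's objection: logsig already maps its
input into (0,1), so the targets handed to train are the desired final 0/1
outputs, not values to be pushed through the transfer function afterwards.
A minimal sketch, assuming the toolbox's logsig is on the path:

   % logsig(n) = 1/(1+exp(-n)) maps any real n into the open interval (0,1)
   logsig(0)      % returns 0.5000
   logsig(-100)   % returns roughly 3.7e-44, i.e. effectively 0
   logsig(100)    % effectively 1

   % Targets for a logsig output layer are therefore supplied directly as
   % values in [0,1], e.g. the 0/1 class labels themselves:
   t = [0 1 0 0 1 0 1 1 0];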
From: Greg Heath on 20 Feb 2010 00:43
On Feb 9, 6:09 pm, "Daniel Blackmer" <daniel.black...(a)remove.thisgmail.dotcom> wrote:
> "George Burdell" <gburde...(a)gmail.com> wrote in message
> <hksnp4$ju...(a)fred.mathworks.com>...
>
> > "Daniel Blackmer" <daniel.black...(a)remove.thisgmail.dotcom> wrote in message
> > <hksm8b$cp...(a)fred.mathworks.com>...
> >
> > > If you want your output values to be between 0 and 1 you should set your
> > > target vector to -1. Don't forget that logsig(0) = .5. That is evaluating
> > > itself correctly for what you have supplied it with. I got quite confused
> > > by this too!
> >
> > Thanks for your help. So the target vectors I provide to "train" must be
> > values "before" the output transfer function (logsig)? Why doesn't it just
> > let you supply the "final" values? And why specifically "-1"? Why not -2,
> > -3, -100, etc.? logsig would evaluate to 0 for those values too.
>
> Like I told you, I was a little befuddled by this when I first started as
> well, and since my field of research involves lots of binary outputs I will
> honestly tell you that I just gave up on using logsig because it never gave
> me what I thought it should. Since most of the time I train with Bayesian
> regularization, I find that it doesn't even let it converge correctly.

Check the source code:

   type logsig

> Well, if you are curious about the -100, I can't explain what it is actually
> doing there, but something else is going on to make it not QUITE as simple
> as the logsig of the output. (For example, I tried logsig(-100), which of
> course should be 0. However, if you make that one of your target values,
> that output will converge to -49.5.) I can't fully explain it to you, but I
> have just learned to work within the framework of how it DOES work, with
> outputs that will be meaningful to whatever task I am trying to accomplish
> with it.

Please alert MATLAB of this behavior.

> It is strange, however, because if you look at the help under
> backpropagation, the diagram they show with the tansig/purelin network and
> the equations they show explaining its behavior do seem to indicate that the
> output is literally going to be logsig(n2), which is mathematically
> impossible if you are getting outputs of -49.5.

Again, something is wrong. logsig worked properly in older versions.

> I will also point out that for this sort of 'function approximation' problem
> you are doing, you probably would want to stick to purelin outputs. I have
> found that for this simple sort of problem it's a much better way to go than
> the sigmoids, and tends to yield better results. Your problem, for example,
> can be solved with a purelin output and 5 hidden nodes, no problem.
>
> Hope my rambling insights help!

Again, please notify MATLAB of any inappropriate behavior.

Hope this helps.

Greg
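Daniel's closing suggestion (a linear output layer with a handful of hidden
nodes) can be written out as follows. This is only a sketch under the same
2010-era newff interface as the earlier posts; the 5-node size is his figure,
not a tuned value, and setting divideFcn to '' is intended to switch off the
random split so every sample is used for training.

   % Function-approximation style setup with a purelin output layer.
   p = 0:8;
   t = [0 1 0 0 1 0 1 1 0];

   net = newff(p,t,5,{'tansig' 'purelin'});  % 5 hidden nodes, linear output
   net.divideFcn = '';                        % assumption: '' disables data division
   [net,tr] = train(net,p,t);

   y    = sim(net,p);                         % unbounded real-valued outputs
   yhat = y > 0.5;                            % threshold back to 0/1 labels

Because purelin does not confine the outputs to (0,1), a threshold (or
rounding) step is needed to recover class labels, which is the trade-off
against a logsig output layer.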