RBF neural network - How to estimate "spread" [Matlab]


From: Balwinder Singh on 9 Nov 2009 17:25

Hi All,
In an RBF neural network what is the best way to get the spread?

In my problem, it seems like if I use the default spread (=1.0) I get spikes at the points where data is available and constant values otherwise. If I increase spread to 10.0, I get a regular curve, but I am not sure how to get an optimal value for spread.

Also, if I over-fit the data by using the maximum number of neurons, the SSE approaches zero (at the very last neuron addition); otherwise, the SSE stays at a huge value (~400.0). There is a huge drop in SSE at the point when I add the last neuron to the network. Is this typical?

Thanks!

From: Greg Heath on 10 Nov 2009 21:17

On Nov 9, 5:25 pm, "Balwinder Singh" <balwindersi...(a)gmail.com> wrote:
> Hi All,
> In an RBF neural network what is the best way to get the spread?

Classification or regression?
Dimensionality of the input space?
Number of training vectors?

> In my problem, it seems like if I use the default spread (=1.0) I get spikes at the points where data is available and constant values otherwise.

How many hidden nodes generated?

> If I increase spread to 10.0, I get a regular curve

How many hidden nodes generated?

> but I am not sure how to get an optimal value for spread.

Trial and error.

> Also, if I over-fit the data by using the maximum number of neurons, the SSE approaches zero (at the very last neuron addition); otherwise, the SSE stays at a huge value (~400.0). There is a huge drop in SSE at the point when I add the last neuron to the network. Is this typical?

No.

More details needed.

Greg

From: Balwinder Singh on 11 Nov 2009 09:48

Greg Heath <heath(a)alumni.brown.edu> wrote in message <05f8fa2c-0e40-4bf0-ab08-478e305c0311(a)m13g2000vbf.googlegroups.com>...
> On Nov 9, 5:25 pm, "Balwinder Singh" <balwindersi...(a)gmail.com> wrote:
> > Hi All,
> > In an RBF neural network what is the best way to get the spread?
>
> Classification or regression?
-- I am working on a regression problem

> Dimensionality of the input space?
-- I have 2D input

> Number of training vectors?
-- Number of training vectors is 21
>
>
> > In my problem, it seems like if I use the default spread (=1.0) I get spikes at the points where data is available and constant values otherwise.
>
> How many hidden nodes generated?
-- The SSE doesn't go below the goal (10^-5), so the network generates 20 hidden nodes. At the 20th hidden node, I get a huge drop in the residual.

>
> If I increase spread to 10.0, I get a regular curve
>
> How many hidden nodes generated?
--Same as above

>
> but I am not sure how to get an optimal value for spread.
>
> Trial and error.
>
> > Also, if I over-fit the data by using the maximum number of neurons, the SSE approaches zero (at the very last neuron addition); otherwise, the SSE stays at a huge value (~400.0). There is a huge drop in SSE at the point when I add the last neuron to the network. Is this typical?
>
> No.
-- Does that mean there is a problem in the algorithm (I am using my own Fortran code based on MATLAB's newrb function), or do I just need to play with the "spread"?

> More details needed.
-- I am using 2D input with a spacing of 15 in the x-direction and a spacing of 30 in the y-direction (training set). The spread values I have tried so far are 1 and 10.

I also did some testing on monotone (increasing) functions such as the square root of a number. It seems that, for sparse data, if I set the spread to 2*(spacing between the training data points), I get a good answer. This logic doesn't work for a step function. Are there guidelines for choosing the spread for different kinds of functions?

Thanks a lot for your reply!!

From: Greg Heath on 11 Nov 2009 23:47

On Nov 11, 9:48 am, "Balwinder Singh" <balwindersi...(a)gmail.com> wrote:
> Greg Heath <he...(a)alumni.brown.edu> wrote in message <05f8fa2c-0e40-4bf0-ab08-478e305c0...(a)m13g2000vbf.googlegroups.com>...
> > On Nov 9, 5:25 pm, "Balwinder Singh" <balwindersi...(a)gmail.com> wrote:
> > > Hi All,
> > > In an RBF neural network what is the best way to get the spread?
>
> > Classification or regression?
>
> -- I am working on a regression problem
>
> > Dimensionality of the input space?
>
> -- I have 2D input
>
> > Number of training vectors?
>
> -- Number of training vectors is 21
>
> > > In my problem, it seems like if I use the default spread (=1.0) I get spikes at
> > > the points where data is available and constant values otherwise.

I'm not surprised. Spread should depend on the scale of the data.
When the distance from a hidden node's center increases as

d = [0:4]*spread,

the corresponding hidden node value decreases as

radbas(sqrt(log(2))*d/spread) = [1 0.5 0.0625 0.0020 0.0000]

Therefore, a search for the optimum spread in an interval including
0.5*min(dist(p,p') ) is not unreasonable.
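
The decay values above can be reproduced outside MATLAB. The following is a minimal Python sketch, not toolbox code: it assumes only that radbas(n) is exp(-n^2) and that the hidden-node bias is sqrt(log(2))/spread, as in the expression above. The function names and the spread value are illustrative.

```python
import math

def radbas(n):
    """Gaussian radial basis function exp(-n^2), matching MATLAB's radbas."""
    return math.exp(-n * n)

def hidden_response(distance, spread):
    """Hidden-node output at a given distance from the node's center.

    The bias sqrt(ln 2)/spread makes the response 0.5 when distance == spread.
    """
    bias = math.sqrt(math.log(2.0)) / spread
    return radbas(bias * distance)

spread = 10.0  # illustrative value; the decay pattern is the same for any spread
responses = [hidden_response(k * spread, spread) for k in range(5)]
print([round(r, 4) for r in responses])
```

Because the response at k spreads is 2^(-k^2), a node contributes essentially nothing beyond about three spreads. That is consistent with the spikes seen when spread = 1 is used on data spaced 15 and 30 apart.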

> > How many hidden nodes generated?
>
> --The SSE doesn't go below the goal (10^-5) ,

Well, what value do you get?

What makes you think 1e-5 is a reasonable number?

> therefore the network generates 20 hidden nodes.
> At the 20th hidden node, I get a huge drop in the residual.

By memorizing the training data. Not very useful.

If your model produced a constant output y = mean(t),
none of the output variance would be modeled, whereas
more than 99% of the output variance would be modeled
with

MSE < MSEgoal = mse(t-mean(t))/100 < var(t)/100

Therefore, a reasonable input choice is

goal = SSEgoal = sse(t-mean(t))/100
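
As a rough illustration of this goal, here is a small Python sketch. The target values are made up for the example, and the helper names are not toolbox functions. It computes the SSE of the constant model y = mean(t) and takes 1% of it as the training goal.

```python
def sse(errors):
    """Sum of squared errors."""
    return sum(e * e for e in errors)

def sse_goal(targets, fraction_unexplained=0.01):
    """SSE goal that leaves at most the given fraction of the variation unmodeled.

    The reference is the constant model y = mean(t), which models none of it.
    """
    mean_t = sum(targets) / len(targets)
    return fraction_unexplained * sse([t - mean_t for t in targets])

targets = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical targets, not data from this thread
goal = sse_goal(targets)
print(goal)
```

A goal defined this way scales with the data, unlike a fixed value such as 1e-5, which may be far tighter than the data can justify.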

> > If I increase spread to 10.0, I get a regular curve
>
> > How many hidden nodes generated?
>
> --Same as above
>
> > but I am not sure how to get an optimal value for spread.
>
> > Trial and error.

....using a reasonable value for SSEgoal.

> > > Also, if I over-fit the data by using the maximum number of neurons, the SSE
> > > approaches zero (at the very last neuron addition); otherwise, the SSE stays at
> > > a huge value (~400.0). There is a huge drop in SSE at the point when I add
> > > the last neuron to the network. Is this typical?
>
> > No.
>
> -- Does that mean there is a problem in the algorithm (I am using my own Fortran
> code based on MATLAB's newrb function), or do I just need to play with
> the "spread"?

Obtain a "feel" for the data:

Obtain a contour plot of the data.
Superimpose x-y data points.
Calculate dist(p,p').
etc
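
The numeric part of this checklist might look like the Python sketch below. The 7-by-3 grid is an assumption: the thread gives only the number of training vectors (21) and the spacings (15 in x, 30 in y), not the actual layout. The minmax helper only mimics the idea of MATLAB's minmax. Pairs are enumerated without repetition, so the zero self-distances on the diagonal of a full distance matrix do not hide the nearest-neighbour distance.

```python
import math
from itertools import combinations

# Hypothetical training grid: 7 points in x (spacing 15) by 3 in y (spacing 30).
points = [(15.0 * i, 30.0 * j) for i in range(7) for j in range(3)]

def minmax(pts):
    """Per-dimension (min, max) of the input vectors."""
    return [(min(col), max(col)) for col in zip(*pts)]

# Distances between distinct points only; self-distances are excluded.
pair_distances = [math.dist(a, b) for a, b in combinations(points, 2)]
nearest = min(pair_distances)
farthest = max(pair_distances)

print("extent per dimension:", minmax(points))
print("nearest-neighbour distance:", nearest)
print("largest distance:", round(farthest, 2))
print("half the nearest-neighbour distance:", 0.5 * nearest)
```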

> > More details needed.
>
> -- I am using 2D input with a spacing of 15 in x-direction and a spacing of 30 in
> the y-direction (training set).

Seems to indicate that searching for an optimum spread
in an interval containing 7.5 and 15 is reasonable.

What is minmax(p)?

> The spread values I have tried so far are 1 and 10.

Why so few? Why no consideration of data spacing or extent?
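
One way to organize such a trial-and-error search is sketched below in Python. This is a simplified stand-in, not newrb: instead of adding neurons greedily, it places one Gaussian node on every training point, solves for the output weights with a small ridge term, and scores each candidate spread by leave-one-out SSE. The grid layout, the target function, the ridge value and the candidate list are all assumptions made for illustration.

```python
import math

def kernel(p, q, spread):
    """Gaussian basis value; equals 0.5 when p and q are one spread apart."""
    bias = math.sqrt(math.log(2.0)) / spread
    return math.exp(-(bias * math.dist(p, q)) ** 2)

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def loo_sse(points, targets, spread, ridge=1e-6):
    """Leave-one-out SSE for a network with one node per training point."""
    total = 0.0
    for i in range(len(points)):
        train_p = points[:i] + points[i + 1:]
        train_t = targets[:i] + targets[i + 1:]
        mean_t = sum(train_t) / len(train_t)  # plays the role of the output bias
        gram = [[kernel(p, q, spread) + (ridge if r == c else 0.0)
                 for c, q in enumerate(train_p)]
                for r, p in enumerate(train_p)]
        weights = solve(gram, [t - mean_t for t in train_t])
        prediction = mean_t + sum(w * kernel(points[i], q, spread)
                                  for w, q in zip(weights, train_p))
        total += (targets[i] - prediction) ** 2
    return total

# Hypothetical 7-by-3 grid (spacing 15 in x, 30 in y) and a made-up smooth target.
points = [(15.0 * i, 30.0 * j) for i in range(7) for j in range(3)]
targets = [math.sin(x / 30.0) + 0.01 * y for x, y in points]

candidates = [3.75, 7.5, 15.0, 30.0]  # an interval containing 7.5 and 15
scores = {s: loo_sse(points, targets, s) for s in candidates}
best_spread = min(scores, key=scores.get)
print(scores, best_spread)
```

Scoring on held-out points rather than on training SSE matters here: with one node per training point, the training error can be driven to nearly zero for any spread, which is the memorization effect described earlier.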

Since the covariance matrix of the radial basis function is scalar
(the same spread in every input direction), why not standardize the data?
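
A minimal Python sketch of that standardization follows, again on an assumed 7-by-3 grid, since the actual inputs are not given in the thread. Each input dimension is shifted to zero mean and scaled to unit sample standard deviation, so a single spread value treats both directions on a comparable scale.

```python
import math

def standardize(points):
    """Return (scaled_points, means, stds), using the sample standard deviation."""
    columns = list(zip(*points))
    means = [sum(col) / len(col) for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
            for col, m in zip(columns, means)]
    scaled = [tuple((v - m) / s for v, m, s in zip(p, means, stds))
              for p in points]
    return scaled, means, stds

# Hypothetical grid: spacing 15 in x and 30 in y.
points = [(15.0 * i, 30.0 * j) for i in range(7) for j in range(3)]
scaled, means, stds = standardize(points)
print(means, stds)
```

The same means and standard deviations would have to be applied to any new input before it is passed to the trained network.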

> I also did some testing on monotone (increasing) functions such as the square root
> of a number. It seems that, for sparse data, if I set the spread to 2*(spacing
> between the training data points), I get a good answer. This logic doesn't work for
> a step function. Are there guidelines for choosing the spread for different kinds of
> functions?

You would probably want important data to be within 1 or 2 spreads of the
nearest hidden node center.
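
This rule of thumb can be checked mechanically. The Python sketch below uses made-up centers, data points and spread, none of which come from this thread. It reports each point's distance to its nearest center in units of spread and flags points farther than 2 spreads away.

```python
import math

def coverage(points, centers, spread):
    """Distance from each point to its nearest center, in units of spread."""
    return [min(math.dist(p, c) for c in centers) / spread for p in points]

# Hypothetical values chosen only to illustrate the check.
centers = [(0.0, 0.0), (30.0, 30.0)]
points = [(0.0, 15.0), (30.0, 52.5), (90.0, 60.0)]
spread = 15.0

ratios = coverage(points, centers, spread)
uncovered = [p for p, r in zip(points, ratios) if r > 2.0]
print([round(r, 2) for r in ratios], uncovered)
```

Points in the flagged list sit where every hidden node's response has decayed to roughly 0.06 or less, so the network output there is dominated by the output bias.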

Hope this helps.

Greg

From: Balwinder Singh on 12 Nov 2009 10:23

Thanks Greg!! It was really insightful and helpful. I will get back if I have any other questions. Thanks a lot!!
