From: Hamdi Abdelouahed Abdallah on
I have a problem with a huge dataset: when I train the neural network with nprtool, it gives an "Out of memory" error.

Can I train the network on only part of the data?
From: Greg Heath on
On Apr 18, 4:17 pm, "Hamdi Abdelouahed Abdallah"
<abd_ham_j...(a)yahoo.fr> wrote:
> I have a problem with a huge dataset: when I train the neural network with nprtool, it gives an "Out of memory" error.
>
> Can I train the network on only part of the data?

The correct approach depends on the problem.
Please characterize yours:

1. Regression or Classification?
2. Sizes of the training, validation and test subsets,
Ntrn, Nval, Ntst?
3. Dimensionality of the input and output vectors, I, O?
4. Number of hidden nodes, H?
5. Training algorithm and objective function?

If you are not using regularization or Early Stopping,
the ratio of the number of training equations, Neq,
to the number of unknown weights, Nw, should be
sufficiently large that

r = Neq/Nw >> 1
where Neq = Ntrn*O
and
Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H

Many successful designs have

~ 5 <= r <= ~ 30

The optimal size will depend on the complexity of the
trend of the data and the amount of measurement noise.
It is best found by trial and error.

So, choose a value of r. Then for each candidate
value of H you can estimate a sufficient minimum
size for the training subset.
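
The estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming a single hidden layer with bias terms as in the Nw formula; the function names and the example values of I, O and r are hypothetical and not taken from any toolbox.

```python
import math

def num_weights(I, H, O):
    """Nw = (I+1)*H + (H+1)*O for one hidden layer with biases."""
    return (I + 1) * H + (H + 1) * O

def min_training_size(I, H, O, r):
    """Smallest Ntrn for which Neq/Nw = Ntrn*O/Nw is at least r."""
    return math.ceil(r * num_weights(I, H, O) / O)

if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    I, O, r = 10, 2, 10
    for H in (2, 5, 10):
        print(H, num_weights(I, H, O), min_training_size(I, H, O, r))
```

Tabulating the minimum Ntrn over a few candidate values of H shows how quickly the required training subset grows with the hidden layer.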

In addition, you need to make sure that the chosen
training subset of I-dimensional input vectors is
representative of the full dataset, as a random draw
would be.
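
One way to draw such a subset is uniform sampling without replacement over the column indices. The sketch below assumes that this is an acceptable notion of "representative"; it does not check class proportions, and the function name and seed are hypothetical.

```python
import random

def random_subset_indices(n_total, n_subset, seed=0):
    """Return n_subset distinct indices in [0, n_total), drawn uniformly."""
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    return sorted(rng.sample(range(n_total), n_subset))

if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    idx = random_subset_indices(1000, 100)
    print(len(idx), idx[:5])
```

The returned indices would then be used to select the matching columns of both the input and the target matrices.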

Hope this helps.

Greg
From: Hamdi Abdelouahed Abdallah on
It's a pattern recognition neural network used for predicting the secondary structure of proteins (classification).

The dataset contains <357*95000 double> inputs and <3*95000 double> targets.

70% for training, 15% for validation, 15% for test

1 hidden layer with 5 neurons
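
For reference, plugging these figures into the ratio r = Neq/Nw from the earlier reply can be sketched as follows. This is plain arithmetic on the numbers quoted in the thread (I = 357, O = 3, H = 5, 95000 samples, 70% for training); the variable names are hypothetical, and the result says nothing about what actually causes the out-of-memory error.

```python
import math

I, O, H = 357, 3, 5                      # sizes quoted in the thread
N = 95000                                # total number of samples
Ntrn = N * 70 // 100                     # 70% training split

Nw = (I + 1) * H + (H + 1) * O           # unknown weights
Neq = Ntrn * O                           # training equations
r = Neq / Nw

# Training subset sizes that would give r = 5 and r = 30.
Ntrn_r5 = math.ceil(5 * Nw / O)
Ntrn_r30 = math.ceil(30 * Nw / O)

# Raw storage of the input matrix as 8-byte doubles.
input_bytes = I * N * 8

if __name__ == "__main__":
    print(Nw, Neq, round(r, 1))
    print(Ntrn_r5, Ntrn_r30)
    print(input_bytes)
```

With these numbers r is well above the 5 to 30 range mentioned earlier, which suggests that a much smaller random training subset may be enough to try first.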