From: Greg Heath on
PLEASE DO NOT TOP POST. TYPE ALL REPLIES WITHIN OR BELOW
THE PREVIOUS POST.

On Apr 24, 1:00 am, "Adham " <atya0...(a)flinders.edu.au> wrote:
> Dear Greg
>
> thanks for your help. I was performing some extra tests and I
> realized that after performing SVD, the results come out similar
> to the results of random data sets. Later on, I also did
> some down sampling on my data and, without doing any data
> decomposition or feature extraction, the results with a
> basic tansig, logsig, trainscg neural net reached 70%.
> Considering that my data is coming from a multidimensional
> matrix (actually it is a cell array) of 118x2500x280 and
> my target is 2x280, what do you suggest for feature
> extraction? I need to reduce the size of my input data (P)
> before sending it to the neural net or the accuracy drops
> dramatically. What I am doing now is performing 3 steps of
> unfolding+svd+refolding in this way:
> [i,j,k]=size(P);% i=118, j=2500,k=280
> [d,dimsize,dimorder]=unfold(P,1,[2 3]); % d=118x700000
> [u,s,v]=svd(d,'econ');
> d=s*v';
> P=refold(d,dimsize,dimorder);% 118x2500x280
> [d,dimsize,dimorder]=unfold(P,2,[1 3]); % d=2500x33040
> [u,s,v]=svd(d,'econ');
> d=s*v';
> P=refold(d,dimsize,dimorder);% 118x2500x280
> [d,dimsize,dimorder]=unfold(P,3,[1 2]); % d=280x295000
> [u,s,v]=svd(d,'econ');
> d=s*v';
> P=refold(d,dimsize,dimorder);% 118x2500x280

I don't understand what you are doing. The purpose of the
SVD is to reduce the dimensionality of the N = 280 input
vectors of dimensionality M = 295000. Project the input
vectors into the space spanned by the Q singular vectors
corresponding to singular values larger than a threshold.
Typically, the threshold is chosen to either
1. (preferred) keep enough singular values so that their sum
stays above a specified fraction (e.g., 0.99 or 0.975 or ..) of
the original sum,
or
2. keep all singular vectors whose singular values exceed a
specified fraction (e.g., 0.01) of the largest.

> after this, I will downsample the data by a factor of 2 in
> the second dimension, reshape it to a 2-dimensional matrix,

It might be better to subsample the original (295000,280)
matrix and perform SVD on the smaller subsamples.
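
For example (a sketch only; step = 2 and the every-other-sample
choice are arbitrary):

step = 2;                        % keep every 2nd time sample
Psub = P(:,1:step:end,:);        % 118 x 1250 x 280
X2   = reshape(Psub,[],280);     % 147500 x 280
[U2,S2,V2] = svd(X2,'econ');     % then threshold and project as above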

> break it into three sub datasets for training, testing and
> validation (using the SRS function that I explained before) and
> then pass it to the nn. However, the size of the final P (before
> breaking it into three parts) is still huge and it looks like the
> nn memorizes the data instead of adapting weights (if that is a
> good way to describe it).

No. It is not. Memorization occurs when the size
of the data set is small and the number of training
equations Neq = 280 (assuming a single output for two
classes) is not sufficiently larger than the number of
weights to be estimated (Nw = (I+1)*H+(H+1)*1 for an
I-H-1 topology).
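
For example (a sketch; I and H here are only placeholders, not
recommendations):

I   = 50;                        % reduced input dimension after the SVD
H   = 10;                        % hidden units
O   = 1;                         % single output for the two classes
N   = 280;                       % training cases
Neq = N*O                        % = 280 training equations
Nw  = (I+1)*H + (H+1)*O          % = 521 unknown weights
r   = Neq/Nw                     % want r >> 1; here r < 1, so shrink I and/or H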

Search the group archives using

greg heath Neq Nw


> I would
> appreciate it if you could give me some suggestions about the
> way I am preparing my data for the nn.

See above

> I also need to know how to
> determine the best down sampling size/factor.

Trial and error
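
For example (a sketch only; the candidate factors and the SRS-style
split are placeholders):

factors = [1 2 4 5 10];
for k = 1:numel(factors)
    Pk = P(:,1:factors(k):end,:);   % downsample the 2nd dimension
    Xk = reshape(Pk,[],280);        % unfold to features x cases
    % ... reduce with the SVD as above, split into train/val/test
    %     (e.g., with your SRS function), train the tansig/logsig
    %     trainscg net, and record the validation accuracy ...
end
% keep the factor with the best validation accuracy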

PLEASE DO NOT TOP POST. TYPE ALL REPLIES WITHIN OR BELOW
THE PREVIOUS POST.

Hope this helps.

Greg