From: Frederik Grosserueschkamp on
Hi,

I have the following problem:

I want to run pdist on a large array [640*512*385], but every time I get this error:

*****
??? Error using ==> pdistmex
Distance matrix has more elements than the maximum allowed size in MATLAB.

Error in ==> pdist at 211
Y = pdistmex(X',dist,additionalArg);

Error in ==> compute_mat_hierarchy at 15
M=pdist(C,'correlation');

Error in ==> puzzle_spectra at 34
[E Z]=compute_mat_hierarchy(D,rng,file);
*****

I know it has something to do with MWSIZE_MAX, but I'm not able to solve the problem.
Here is some information about the system used:

Matlab R2009a 64bit
Suse Linux 11.1 64bit

I could understand it if there were not enough memory for the linkage, but why do I get this error, and how can I overcome it, if that is possible?

Thanks in advance for your help.

Greetings

Frederik
From: Steven Lord on

"Frederik Grosserueschkamp" <fred(a)bph.rub.de> wrote in message
news:hcf6n7$drn$1(a)fred.mathworks.com...
> Hi,
>
> I have the following problem:
>
> I want to do a pdist with an large array [640*512*385] but every time I
> get this error:

Are you passing a 3-D array into PDIST? If so, that shouldn't work -- you
should receive an error, as PDIST is intended for 2-D matrices.

> *****
> ??? Error using ==> pdistmex
> Distance matrix has more elements than the maximum allowed size in MATLAB.

Assuming that your matrix is (640*512)-by-385 [i.e. you have 327680 points
in 385-D space] the resulting matrix would require a contiguous block of
memory:

>> entries = (327680*(327680-1)/2);
>> bytes = entries*8;
>> gigabytes = bytes/(1024^3);

about 400 gigabytes in size. To put that into perspective, take a look at
the examples here:

http://en.wikipedia.org/wiki/Gigabyte

Assuming you were able to create this matrix, and were able to process 1000
entries per second, it would take you:

>> days = (entries/1000)/(60*60*24)

about 1 3/4 years to process the distances.
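
The same arithmetic can be turned around to ask how many points would fit in
a given amount of memory. A minimal sketch, assuming double-precision output
(8 bytes per distance); the 8 GB budget and the variable names are
hypothetical and only there for illustration:

```matlab
% Size of the pdist output for n points: n*(n-1)/2 doubles.
n = 640*512;                          % 327680 points, as assumed above
entries = n*(n-1)/2;                  % number of pairwise distances
gigabytes = entries*8/1024^3;         % 8 bytes per double

% Largest n whose distance vector fits in an assumed memory budget.
budgetGB = 8;                         % hypothetical budget, not from the thread
maxEntries = budgetGB*1024^3/8;       % how many doubles fit in the budget
nMax = floor((1 + sqrt(1 + 8*maxEntries))/2);   % solves n*(n-1)/2 <= maxEntries

fprintf('%d points need about %.0f GB\n', n, gigabytes);
fprintf('About %d points fit in %d GB\n', nMax, budgetGB);
```

Because the output grows with the square of the number of points, halving
the number of points cuts the memory needed to roughly a quarter.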

> I would be able to understand that there is not enough memory for the
> linkage but why I get this error and how to overcome it if possible.

If the solution to the original problem for which you're computing this
distance matrix involves processing each of the distances in turn, I would
explore alternate methods to solve that problem.
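
One possible alternative is to compute the distances in row blocks and reduce
each block immediately, so that the full distance matrix is never stored. The
following is only a sketch under stated assumptions: a small random matrix
stands in for the real data, the block size and the per-row reduction
(nearest neighbour) are made up for illustration, and the 'correlation'
distance is taken to be one minus the sample correlation between rows, as
described in HELP PDIST.

```matlab
% Stand-in data; the real matrix would be 327680-by-385.
X = rand(2000, 385);
n = size(X, 1);

% Centre each row and scale it to unit length, so that the dot product
% of two rows equals their sample correlation.
Xc = bsxfun(@minus, X, mean(X, 2));
Xn = bsxfun(@rdivide, Xc, sqrt(sum(Xc.^2, 2)));

blockSize = 500;                 % hypothetical; tune to the available memory
nearestDist = zeros(n, 1);       % example reduction: distance to nearest neighbour
nearestIdx  = zeros(n, 1);       % index of that neighbour

for first = 1:blockSize:n
    last = min(first + blockSize - 1, n);
    rows = first:last;

    % Correlation distance from this block of rows to every row.
    D = 1 - Xn(rows, :) * Xn';

    % Exclude self-distances before taking the minimum.
    selfIdx = sub2ind(size(D), 1:numel(rows), rows);
    D(selfIdx) = Inf;

    [nearestDist(rows), nearestIdx(rows)] = min(D, [], 2);
end
```

This only removes the memory problem. Every pair of rows is still visited, so
the running time remains quadratic in the number of points.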

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ


From: Frederik Grosserueschkamp on
"Steven Lord" <slord(a)mathworks.com> wrote in message <hcfcj8$g23$1(a)fred.mathworks.com>...
>
> "Frederik Grosserueschkamp" <fred(a)bph.rub.de> wrote in message
> news:hcf6n7$drn$1(a)fred.mathworks.com...
> > Hi,
> >
> > I have the following problem:
> >
> > I want to do a pdist with an large array [640*512*385] but every time I
> > get this error:
>
> Are you passing a 3-D array into PDIST? If so, that shouldn't work -- you
> should receive an error, as PDIST is intended for 2-D matrices.
>

Of course, for pdist it is a 327680-by-385 matrix.

> > *****
> > ??? Error using ==> pdistmex
> > Distance matrix has more elements than the maximum allowed size in MATLAB.
>
> Assuming that your matrix is (640*512)-by-385 [i.e. you have 327680 points
> in 385-D space] the resulting matrix would require a contiguous block of
> memory:
>
> >> entries = (327680*(327680-1)/2);
> >> bytes = entries*8;
> >> gigabytes = bytes/(1024^3);
>
> about 400 gigabyte in size. To put that into perspective, take a look at
> the examples here:
>
> http://en.wikipedia.org/wiki/Gigabyte
>
> Assuming you were able to create this matrix, and were able to process 1000
> entries per second, it would take you:
>
> >> days = (entries/1000)/(60*60*24)
>
> about 1 3/4 years to process the distances.
>

Phew, that's long. *looking for other possibilities*

> > I would be able to understand that there is not enough memory for the
> > linkage but why I get this error and how to overcome it if possible.
>
> If the solution to the original problem for which you're computing this
> distance matrix involves processing each of the distances in turn, I would
> explore alternate methods to solve that problem.
>

Thanks for your answer. It's inspiring.

But I don't understand why I get this particular error from pdistmex. Why not an out-of-memory error or something like that?
From: Steven Lord on

"Frederik Grosserueschkamp" <fred(a)bph.rub.de> wrote in message
news:hcfe77$r01$1(a)fred.mathworks.com...
> "Steven Lord" <slord(a)mathworks.com> wrote in message
> <hcfcj8$g23$1(a)fred.mathworks.com>...

*snip*

> Thanks for your answer. It's inspiring.
>
> But I don't get the point why I get especially this error with the
> pdistmex, why not a out of memory or something like this?

Take a look at the second output of the COMPUTER function and compare it with
the number of entries the output of PDIST would have had (as described in HELP
PDIST).

For matrices that are larger than the largest contiguous block of memory you
have available, you'll receive an out of memory error.
For matrices that are even larger, MATLAB doesn't even try to create them --
it knows it won't be able to and so just throws the error you received or
something similar.
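
A minimal sketch of that comparison follows. The variable names are
illustrative, and the signed 32-bit limit is included only as an assumed
reference point; which limit pdistmex actually checks is not stated in this
thread, and the result will differ between platforms and releases.

```matlab
% Number of elements pdist would return for n points.
n = 640*512;
entries = n*(n-1)/2;

% Second output of COMPUTER: the maximum number of elements allowed in an
% array on this MATLAB installation.
[archStr, maxElements] = computer;

% Assumed reference point only: the largest signed 32-bit integer.
int32Limit = double(intmax('int32'));

fprintf('pdist output would have %.0f elements\n', entries);
fprintf('computer reports a maximum of %.0f elements (%s)\n', maxElements, archStr);
fprintf('exceeds computer limit: %d, exceeds int32 limit: %d\n', ...
        entries > maxElements, entries > int32Limit);
```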

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ