From: Tim on
Is this really the sweet missing piece I've been waiting for since I started using PCT?
Will dload() distribute the data to the workers "automagically" without the entire matrix going into the memory of the head worker?
Can I just call dload() on a 60GB .mat file and have it automatically load itself onto 60 different workers as a distributed matrix?
From: Edric M Ellis on
"Tim " <evitaerc(a)gmail.com> writes:

> Is this really really the sweet missing piece I've been waiting for since I've
> started using PCT? Will dload() distribute itself to the workers "automagically"
> without the entire matrix going into the memory of the head worker? Can I just
> call dload() on a 60GB .mat file and have it automatically load itself onto 60
> different workers as a distributed matrix?

Glad you like it! Yes, the intention is that DLOAD can load a really
huge .MAT file and distribute it directly into a DISTRIBUTED array (the
client-side view of the CODISTRIBUTED array) without overloading the
memory of the client. It reads the .MAT file in 128 MB chunks and sends
each chunk directly to the appropriate worker.

One constraint is that the .MAT file must be saved with -v7.3 (DSAVE
does this automatically). Also, if you have a .MAT file with "standard"
MATLAB arrays in there, you'll want the "dload -scatter" variant.
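A minimal sketch of the intended workflow (the variable and file names here are just examples, and the calling syntax is from memory, so check the doc):

```matlab
% On the client, with a pool of workers open:
matlabpool open 60

% Build a distributed array and save it - DSAVE writes -v7.3 automatically.
D = distributed.rand(10000);        % client-side view of a codistributed array
dsave bigdata.mat D

% Later (or in another session with a pool open), load it straight back
% onto the workers without materializing the whole matrix in client memory:
dload bigdata.mat

% For a -v7.3 .MAT file containing ordinary (non-distributed) arrays,
% the -scatter variant distributes them across the workers on load:
dload -scatter plainArrays.mat
```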

I should point out that we're not yet doing anything sophisticated using
parallel filesystems and MPI-IO, so the current implementation of
DSAVE/DLOAD is throttled by the rate at which the client can read the
data from disk and then send it to the workers. Is this something you're
interested in? Do you already have a parallel filesystem on your
cluster?

Cheers,

Edric.
From: Tim on
Yes Edric, that is definitely something I hope gets implemented in the future (along with sparse distributed matrices eventually supporting inversion via MUMPS or whatever). We do large PDE-constrained optimization problems, and currently go outside of MATLAB to do our PDE modeling. dload() will be fantastic for loading in the large (~10GB) data generated by these kinds of programs. Since the modeling is done many times over the course of the inversion process, it'll be great if the IO overhead is greatly reduced as well, by having nodes directly fetch data fragments. AFAIK we use shared filesystems for all the worker nodes, and I'm aware of no other lab that doesn't do likewise.
From: Edric M Ellis on
"Tim " <evitaerc(a)gmail.com> writes:

> Yes Edric, that is definitely something I hope gets implemented in the
> future (along with sparse distributed matrices eventually supporting
> inversion via MUMPS or whatever). We do large PDE-constrained
> optimization problems, and currently go outside of MATLAB to do our
> PDE modeling. dload() will be fantastic for loading in the large
> (~10GB) data generated by these kinds of programs. Since the modeling
> is done many times over the course of the inversion process, it'll be
> great if the IO overhead is greatly reduced as well, by having nodes
> directly fetch data fragments.

Thanks for sharing your use-case.

> AFAIK we use shared filesystems for all the worker nodes and I'm
> currently aware of no other lab that isn't likewise.

Right, but there are certain filesystems such as lustre and pvfs2 which
support high performance parallel access - do you know if you have one
of those?

Cheers,

Edric.
From: Tim on
Edric M Ellis <eellis(a)mathworks.com> wrote in message <ytwiq8n5p30.fsf(a)uk-eellis-deb5-64.mathworks.co.uk>...
> Right, but there are certain filesystems such as lustre and pvfs2 which
> support high performance parallel access - do you know if you have one
> of those?
>

Yep, we're using GPFS on an InfiniBand backbone.