From: Alex Poe on 1 Aug 2010 15:22

Hello,

I'm considering trying out the parallel toolbox first to solve a simple (though large) system Ax = b. MATLAB comes with an excellent demo showing how to solve this system using spmd. I was able to modify the demo code so that the matrix A and the vector b are what I want them to be. I tested the program on a server with two quad-core CPUs using 'matlabpool start local' (so the labs are all on the same machine) - no problem, it solves the system correctly.

My question is this: I would like to do the same but on a distributed cluster. My school has one, and I have access to it. In fact I've been using it for quite some time, but my programs were all in C and used ScaLAPACK. I would log in to the cluster's shell, qsub my program, log off, and then wait for an email notification from the cluster that my program has been executed.

How does this work with MATLAB? Do I submit the program from within MATLAB? Do I have to stay logged in (and have MATLAB running) until the job is executed?

Would appreciate any help!

Thanks,
--a.
From: Edric M Ellis on 2 Aug 2010 06:18

"Alex Poe" <wasteoff.nospam(a)gmail.com> writes:

> I'm considering trying out the parallel toolbox first to solve a simple (though
> large) system Ax = b. MATLAB comes with an excellent demo showing how to solve this
> system using spmd. I was able to modify the demo code so that the matrix A and
> the vector b are what I want them to be. I tested the program on a server with
> two quad-core CPUs using 'matlabpool start local' (so the labs are all on the same
> machine) - no problem, it solves the system correctly. My question is this: I
> would like to do the same but on a distributed cluster. My school has one, and I
> have access to it. In fact I've been using it for quite some time, but my
> programs were all in C and used ScaLAPACK. I would log in to the cluster's
> shell, qsub my program, log off, and then wait for an email notification from
> the cluster that my program has been executed. How does this work with
> MATLAB?

To run across multiple nodes, you need "MATLAB Distributed Computing Server" on the cluster. See:

http://www.mathworks.com/products/distriben/

This allows you to submit a job to a remote cluster - you can then submit a job to multiple nodes to run your SPMD block across those nodes. (FYI - we use ScaLAPACK behind the scenes to solve "Ax = b" on the cluster.)

You mention "qsub" - are you using SGE or Torque? We have built-in integration with Torque; setting up SGE takes a little more work, but there are instructions telling you exactly what you need to do here.

> Do I submit the program from within MATLAB? Do I have to stay logged in (and
> have MATLAB running) until the job is executed?

If you use MDCS, then you do not need to stay logged in and you can collect your results later.

Cheers,
Edric.
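
A minimal sketch of the submit-now, collect-later workflow described above. It uses the parcluster/batch interface found in MATLAB releases newer than this thread, not the 2010-era one. The profile name 'myClusterProfile', the function solveSystem, the job ID 42, and the placeholder matrix are all assumptions for illustration and do not come from the thread.

```matlab
% Sketch only: 'myClusterProfile' and solveSystem are hypothetical names.

% --- Session 1: submit the job, note its ID, then MATLAB can be closed ---
c = parcluster('myClusterProfile');            % profile describing the cluster
job = batch(c, @solveSystem, 1, {20000}, ...   % 1 output, input n = 20000
            'Pool', 15);                       % 15 pool workers plus the one running solveSystem
fprintf('Submitted job %d\n', job.ID);

% --- Session 2 (later): reconnect and collect the result ---
c = parcluster('myClusterProfile');
job = findJob(c, 'ID', 42);                    % replace 42 with the ID printed in session 1
wait(job);                                     % returns at once if the job has already finished
out = fetchOutputs(job);                       % cell array, one cell per output
x = out{1};

% --- solveSystem.m: separate, hypothetical function file on the MATLAB path ---
function x = solveSystem(n)
    spmd
        % Placeholder diagonally dominant system; substitute your own A and b.
        A = rand(n, 'codistributed') + n * eye(n, 'codistributed');
        b = ones(n, 1, 'codistributed');
        xd = A \ b;                            % solve runs across the pool workers
    end
    x = gather(xd);                            % collect the solution into one ordinary array
end
```

The job ID is the only thing that has to be remembered between the two sessions; the scheduler holds the job in the meantime.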
From: Matt J on 2 Aug 2010 07:07

Edric M Ellis <eellis(a)mathworks.com> wrote in message <ytwlj8p2u8h.fsf(a)uk-eellis-deb5-64.mathworks.co.uk>...

> To run across multiple nodes, you need "MATLAB Distributed Computing
> Server" on the cluster. See:
>
> http://www.mathworks.com/products/distriben/
>
> This allows you to submit a job to a remote cluster - you can then
> submit a job to multiple nodes to run your SPMD block across those
> nodes.
==================

I have a related question. I was told by a TMW sales rep that the Distributed Computing Server allowed you to use parfor with more than 8 workers, but not necessarily on a remote cluster. It could just be used to scale up the number of local workers usable by the Parallel Computing Toolbox.

Can anyone confirm that?
From: Matt J on 2 Aug 2010 07:15

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <i368sr$hgs$1(a)fred.mathworks.com>...

> I have a related question. I was told by a TMW sales rep that the Distributed Computing Server allowed you to use parfor with more than 8 workers, but not necessarily on a remote cluster. It could just be used to scale up the number of local workers usable by the Parallel Computing Toolbox.
>
> Can anyone confirm that?
================

For that matter, I was led to believe that with the Distributed Computing Server, you could split a job across any combination of local/remote workers, e.g. you could chain two 8-core machines together and parallelize across all 16 cores. True?
From: Edric M Ellis on 2 Aug 2010 12:23

"Matt J " <mattjacREMOVE(a)THISieee.spam> writes:

> "Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <i368sr$hgs$1(a)fred.mathworks.com>...
>
>> I have a related question. I was told by a TMW sales rep that the
>> Distributed Computing Server allowed you to use parfor with more than
>> 8 workers, but not necessarily on a remote cluster. It could just be
>> used to scale up the number of local workers usable by the Parallel
>> Computing Toolbox.
>>
>> Can anyone confirm that?
> ================
>
> For that matter, I was led to believe that with the Distributed
> Computing Server, you could split a job across any combination of
> local/remote workers, e.g. you could chain two 8-core machines together
> and parallelize across all 16 cores. True?

That is true - but note that the MDCS workers will be drawn from a different "pool" than the "local" workers from PCT. With MDCS, you need to set up workers (possibly using the "jobmanager"), and you can place them where you wish. For example, you could run 16 MDCS workers across two machines, one of which is a desktop machine. You could then run PARFOR across them all.

Cheers,
Edric.
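
As a rough illustration of running PARFOR across 16 workers spread over two machines, the sketch below opens a pool against a cluster profile and runs a loop on it. It assumes the workers are already set up and described by a profile, here given the hypothetical name 'myClusterProfile', and it uses parpool from MATLAB releases newer than this thread. The loop body is an arbitrary placeholder workload.

```matlab
% Sketch only: 'myClusterProfile' is a hypothetical profile that is assumed
% to describe 16 workers placed across two machines.
pool = parpool('myClusterProfile', 16);   % request all 16 workers

n = 64;
results = zeros(1, n);
parfor k = 1:n
    % Placeholder work: largest eigenvalue magnitude of a random matrix.
    results(k) = max(abs(eig(rand(200))));
end

delete(pool);                             % release the workers when done
```

The loop itself does not change with the worker count or location; only the profile passed to parpool decides where the iterations run.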