From: Lee Samuel Finn
Dear All,

I'm trying to deploy a parallel application using R2009b; however, I'm not meeting with any success. Any help you can offer will be greatly appreciated.

The cluster runs the Torque scheduler and shares its filesystem with the headnode. My application is compiled on the cluster headnode, which has the Parallel Computing, Distributed Computing, and Compiler toolboxes installed. It is intended to run as a matlabpool job. I've installed my own copy of the R2009b MCR and set the scheduler object's ClusterMatlabRoot property to the location of the MCR. I have also set and exported MCR_CACHE_ROOT so that all the nodes know exactly where the application CTF is.

I do not use configurations to set the scheduler, job or task properties: I set these by hand in the compiled code.

My application contacts the scheduler OK and submits the job without trouble. When the job starts, it tries to execute mw_smpd on each of the worker nodes. This step fails because there is no mw_smpd to execute, in either the MCR or the CTF. For example, in the Torque log file I find lines like

rsh lionxk10 "/gpfs/work/lsf5/MATLAB_Compiler_Runtime/v711/bin/mw_smpd" -s -phrase MATLAB -port 20687
bash: /gpfs/work/lsf5/MATLAB_Compiler_Runtime/v711/bin/mw_smpd: No such file or directory

How do I go about compiling and deploying a parallel job on a cluster?

Thanks again for your help,

Sam
From: Raymond Norris
Lee,

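ClusterMatlabRoot should be set to the directory where MDCS (MATLAB Distributed Computing Server) is installed, not to the MCR. The job launcher looks for bin/mw_smpd under that directory, which is why the rsh command in your log fails.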
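
A quick way to check a candidate directory before pointing ClusterMatlabRoot at it is to look for the mw_smpd binary that the rsh line in the log tries to start. The following is only a sketch: the helper name has_mw_smpd is made up for illustration, and the only assumption about the layout is the bin/mw_smpd path that appears in the log. Because the filesystem is shared with the headnode, running it there should reflect what the worker nodes see.

```shell
#!/bin/sh
# Hypothetical helper (not part of MATLAB): succeeds if DIR/bin/mw_smpd
# exists and is executable, i.e. the file the rsh command in the Torque
# log tries to launch on each worker node.
has_mw_smpd() {
    [ -x "$1/bin/mw_smpd" ]
}

# Check every directory given on the command line and report the result.
for root in "$@"; do
    if has_mw_smpd "$root"; then
        echo "$root: bin/mw_smpd found"
    else
        echo "$root: bin/mw_smpd missing"
    fi
done
```

Saved under any name and run with the MCR directory from the log (/gpfs/work/lsf5/MATLAB_Compiler_Runtime/v711) and the MDCS installation directory as arguments, it should report the binary as missing for the former, matching the bash error above.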

Raymond