From: ky on
Dear all,

I'm very new to parallel computing with Matlab, and I have several questions about running parallel computations efficiently on a single multi-core workstation.
My situation is:

Aim:
Run the exact same code in parallel (~16 or more instances), where each instance takes about a day to execute.

Single Workstation:
OS: openSUSE (64bits)
16 CPUs : Intel(R) Xeon(R) Processor E5520 with 4 cores & 8 threads each.
4 x 16 = 64 cores in total
8 x 16 = 128 threads in total
maxNumCompThreads can set the total number of threads up to 512. (I'm not sure why the maximum can exceed 128, but it might be that Matlab can hold 8 threads on a single core.)

Matlab:
R2010a
Parallel Computing Toolbox 4.3
(Distributed Computing Server 4.3 not yet installed)

My questions are:
1. Using the Parallel Computing Toolbox local scheduler, I have used 8 workers to run 8 copies of the code in parallel. According to other posts, each worker corresponds to a single thread. Does this mean I'm wasting the computational power of the remaining 128 - 8 (or even 512 - 8) threads?

2. Following other posts, I edited the jobStartup.m file to allow multithreading on each worker: 1, 2, 4, 'automatic', and so on. However, the best performance was achieved with a single thread per worker (the same as the default). Even though I'm not using many functions that support multithreading, how could enabling multithreading slow down the execution? Is this because workers could share common threads, or am I missing something?

3. Since our application needs to run more than 8 copies of the code in parallel, I'm looking for a way to do this:

- I have run 2 pools of workers from 2 Matlab sessions (2 x 8 = 16 workers in total) using parfor, or equivalently 2 job submissions with 8 tasks each from 2 independent Matlab sessions. In both cases, the execution time increased by a factor of 1.7 compared to the 8-worker case. So I assume that creating 16 workers this way does not really give 16 workers, but rather 8 workers each executing 2 copies of the code. Is there any way to create more than 8 workers on the local machine without having the Distributed Computing Server installed? I'm able to run Matlabroot/toolbox/distcomp/bin/admincenter, and I'm aware that the local scheduler does not need the Distributed Computing Server license. So I'm wondering whether it is possible to create a job manager similar to the local scheduler but with more than 8 workers.

- If the above approaches cannot be done, I'm thinking of installing the Distributed Computing Server. In principle, the workstation should be able to run up to ~128 (or 512) workers without noticeable degradation, given the one-worker-one-thread correspondence (as long as the OS does a good job of distributing the tasks). My question is:
Can the Distributed Computing Server handle more than 8 workers on a "local" machine, and hence avoid wasting its computational resources? If so, would this noticeably improve the execution time?
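
For reference, the job submission described in question 3 can be sketched roughly as follows. This is only a minimal sketch, assuming the R2010a-era Parallel Computing Toolbox interface (findResource, createJob, createTask). runOneCase is a hypothetical placeholder for the day-long computation, not a function from this thread.

```matlab
% Minimal sketch (assumes R2010a-era Parallel Computing Toolbox).
% runOneCase is a hypothetical function standing in for one day-long run;
% it is assumed to take a case index and return one output, and it must
% be on the workers' path (e.g. via the job's PathDependencies property).

sched = findResource('scheduler', 'type', 'local');   % local scheduler object

% Maximum number of local workers this scheduler will start
fprintf('ClusterSize = %d\n', get(sched, 'ClusterSize'));

nRuns = 8;
job = createJob(sched);
for k = 1:nRuns
    % One task per run: 1 output argument, input argument k
    createTask(job, @runOneCase, 1, {k});
end

submit(job);
waitForState(job, 'finished');            % block until all tasks are done
results = getAllOutputArguments(job);     % nRuns-by-1 cell array of outputs
destroy(job);                             % remove the job data from disk
```

The ClusterSize value printed here is the worker limit the question runs into. If a job has more tasks than workers, the extra tasks wait in the queue until a worker is free.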

Many many thanks in advance,
ky
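
The jobStartup.m edit mentioned in question 2 might look like the sketch below. It assumes the user-editable jobStartup.m that ships with the toolbox and the maxNumCompThreads function. The value 1 is just one of the settings tried above.

```matlab
function jobStartup(job)
% jobStartup  Runs on a worker the first time it evaluates a task of a job.
% Minimal sketch: set the number of computational threads for this worker.
% Replace 1 with 2, 4, or 'automatic' to try the other settings.

nThreads = 1;                 % assumption: one computational thread per worker
maxNumCompThreads(nThreads);  % limits multithreaded built-ins on this worker
```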
From: ky on
"ky " <kyasui(a)uchicago.edu> wrote in message <hotklf$79u$1(a)fred.mathworks.com>...

Hi all, I made a big mistake in my post. Here is the correction:

> Single Workstation:
> OS: openSUSE (64bits)
> 16 CPUs : Intel(R) Xeon(R) Processor E5520 with 4 cores & 8 threads each.
> 4 x 16 = 64 cores in total
> 8 x 16 = 128 threads in total

There are "2" physical CPUs (Intel(R) Xeon(R) Processor E5520, with 4 cores & 8 threads each), so there are 8 cores and 16 threads in total!
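
The corrected count works out as follows. This is only an arithmetic sketch: the variable names are illustrative, and the figure of 2 hardware threads per core follows from the 4 cores & 8 threads per CPU stated above.

```matlab
% Hardware thread count for the corrected configuration
nCPUs         = 2;   % physical Xeon E5520 processors
coresPerCPU   = 4;
threadsPerCPU = 8;   % i.e. 2 hardware threads per core

nCores   = nCPUs * coresPerCPU;     % total physical cores
nThreads = nCPUs * threadsPerCPU;   % total hardware threads

fprintf('%d cores, %d hardware threads\n', nCores, nThreads);
```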

I'm still very interested in running at least 16 copies of the code in parallel on the local machine, and I would really appreciate any advice and suggestions. Thank you very much in advance!

ky
From: Zhenghua on
Hi, ky.
I'm in the same situation as you. I hope somebody can give an answer. Many thanks.
