From: pratip on
Hallo Group,

Here is a harmless piece of code.

Clear[a];
t=AbsoluteTime[];
a=ParallelTable[0,{i,1,20},{j,1,50}];
DistributeDefinitions[a];
mat=ParallelDo[a[[i,j]]=If[PrimeQ[(i^3+j^5)]==True,1,0],{i,1,20},{j,
1,50}];
AbsoluteTime[]-t

The error is something like

Set::noval: Symbol a in part assignment does not have an immediate value.
Set::noval: Symbol a in part assignment does not have an immediate value.

I know there is a way of using SetSharedVariable but that version of
the code is very slow.

Clear[a];
t=AbsoluteTime[];
a=ParallelTable[0,{i,1,20},{j,1,50}];
SetSharedVariable[a];
mat=ParallelDo[a[[i,j]]=If[PrimeQ[(i^3+j^5)]==True,1,0],{i,1,20},{j,
1,50}];
Print[a//ArrayPlot];
AbsoluteTime[]-t

Time taken: 6.1443514

The single processor version is much faster

Clear[a];
t=AbsoluteTime[];
a=Table[0,{i,1,20},{j,1,50}];
mat=Do[a[[i,j]]=If[PrimeQ[(i^3+j^5)]==True,1,0],{i,1,20},{j,1,50}];
Print[a//ArrayPlot];
AbsoluteTime[]-t

Time taken: 0.0360021

My question is why I can't distribute the definition of an array to my
processor kernels.
Hope someone can give me an answer. I need to make this Do loop
parallel.

Yours,

Pratip

From: Patrick Scheibe on
Hi,

Do you know what a shared variable is? Do you realize how much work
ParallelDo has to do to manage exclusive write access to your "a" for
every single subkernel?
What you want is to build the parts of the solution list separately and
merge them into the final solution at the end.

{t1, mat1} =
AbsoluteTiming[
ParallelTable[
If[PrimeQ[(i^3 + j^5)] == True, 1, 0], {i, 1, 20}, {j, 1, 50}]];

Your code does not do what you really want in several places:

- Why are you initializing an array with 0 and then storing your elements
in it? You just have to build your matrix.
- What do you think is in your variable "mat"?
- Using ParallelTable to initialize/allocate an array really slows things
down. There's nothing for the subkernels to do, and the overhead of
parallelization is far too big.
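
To make the first two points concrete, here is a minimal sketch (not part of the original post). Do is evaluated only for its side effects and returns Null, so "mat" ends up holding nothing useful, while Table returns the matrix itself. Boole is used as a shorthand for If[..., 1, 0], and the names matDo and matTable are made up for illustration.

```mathematica
(* Do runs its body for side effects only and returns Null *)
matDo = Do[i^2, {i, 1, 3}];
matDo === Null   (* True *)

(* Table returns the list it builds, so no preallocated array is needed *)
matTable = Table[Boole[PrimeQ[i^3 + j^5]], {i, 1, 20}, {j, 1, 50}];
Dimensions[matTable]   (* {20, 50} *)
```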

Btw, to really take advantage of parallelization, the calculations
inside the loop should be more complex. Then the subkernels actually
have something to compute instead of being kept busy moving memory
contents back and forth.

Hope this helps a bit.

Cheers
Patrick

On Mon, 2010-04-05 at 08:01 -0400, pratip wrote:
> My question is why I cant distribute the definition of a array to my
> processor kernels.


From: Albert Retey on
Hi,

> My question is why I cant distribute the definition of a array to my
> processor kernels.

If you want to change the same variable from different kernels, you
_have_ to use SetSharedVariable, and there _must_ be some kind of
synchronisation between the kernels, and this _must_ cause overhead. If
you want speed, you need to take another approach, e.g. create a part of
the table on each kernel and only join the results in the master kernel.
It really depends on your problem, and you need to take advantage of
what you know about it: automatic parallelization will often fail to
show speedup.
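
A minimal sketch of that "one part per kernel, join in the master" idea, under stated assumptions: the names rowBlocks and computeBlock and the block size of 5 rows are made up for illustration, and Boole stands in for If[..., 1, 0]. ParallelTable does this kind of splitting for you; the sketch only spells the idea out by hand.

```mathematica
(* Split the 20 row indices into four blocks of five rows each *)
rowBlocks = Partition[Range[20], 5];

(* Each call builds the rows of one block; nothing is written to shared state *)
computeBlock[rows_List] :=
  Table[Boole[PrimeQ[i^3 + j^5]], {i, rows}, {j, 1, 50}];

(* Make the definition known on the subkernels (harmless if already automatic) *)
DistributeDefinitions[computeBlock];

(* Subkernels compute the blocks; the master kernel joins them row-wise *)
a = Join @@ ParallelMap[computeBlock, rowBlocks];

Dimensions[a]   (* {20, 50} *)
```

Because every kernel only returns its own block, no synchronisation on "a" is needed; the only communication is sending the row lists out and the finished blocks back.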

You should also realize that there is _always_ overhead when
parallelizing code, so you can only hope to see speedup if the
calculation times are larger than the expected overhead from
parallelization. For a problem that solves in 0.0360021 seconds on one
kernel, I doubt there is a chance to see any speedup with parallelization
at all (and that holds not only for Mathematica)...

> Hope someone can give me an answer. I need to make this Do loop
> parallel.

For the toy problem you sent, there is no chance to see any speedup
whatsoever. If your real problem is larger, you should show something
that takes at least a few seconds to run; otherwise it will usually not
be possible to see whether parallelization gains anything at all.
You should also not include ArrayPlot in the timing; any speedup will
be even harder to see with it. For what you have shown, I wonder why you
don't just use:

a=ParallelTable[If[PrimeQ[(i^3+j^5)]==True,1,0],{i,1,20},{j,1,50}]

which does not have the problem of synchronization and doesn't need
SetSharedVariable. On the other hand, the problem still solves in 0.036
seconds on one kernel, so you will still suffer from overhead and not
see any speedup... On my 2-processor machine, the parallel version
starts to be faster than the serial one once i and j run up to about 200:

start = AbsoluteTime[];
a1 = Table[
If[PrimeQ[(i^3 + j^5)] == True, 1, 0], {i, 1, 300}, {j, 1, 300}];
AbsoluteTime[] - start

1.1875000

start = AbsoluteTime[];
a2 = ParallelTable[
If[PrimeQ[(i^3 + j^5)] == True, 1, 0], {i, 1, 300}, {j, 1, 300}
];
AbsoluteTime[] - start

0.8906250

For i and j up to 1000, the parallel version shows a speedup of
14.9/8.7=1.7, which is already quite close to the maximum of 2 that you
could expect.
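
For anyone who wants to repeat this kind of comparison, here is a minimal sketch (not the code used for the numbers above). It assumes the variable names n, tSerial and tParallel, uses Boole in place of If[..., 1, 0], and starts the subkernels beforehand so that their launch time is not counted.

```mathematica
(* Start the subkernels before timing, if none are running yet *)
If[$KernelCount == 0, LaunchKernels[]];

n = 300;  (* upper limit for both i and j; increase it to see the trend *)

(* AbsoluteTiming returns {elapsed seconds, result} *)
{tSerial, a1} =
  AbsoluteTiming[Table[Boole[PrimeQ[i^3 + j^5]], {i, 1, n}, {j, 1, n}]];
{tParallel, a2} =
  AbsoluteTiming[ParallelTable[Boole[PrimeQ[i^3 + j^5]], {i, 1, n}, {j, 1, n}]];

a1 === a2           (* both versions should give the same matrix *)
tSerial/tParallel   (* speedup; values below 1 mean the overhead dominates *)
```

As a rough reading of the numbers above, a speedup of 1.7 on 2 kernels corresponds to about 85% of the ideal.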

hth,

albert


From: Zach Bjornson on
Hello Pratip,

You can use ParallelTable instead of ParallelDo. Even in non-parallel
mode, Table is faster than Do (here it worked out to about 1.4x faster).

a = ParallelTable[If[PrimeQ[(i^3 + j^5)] == True, 1, 0], {i, 1, 20}, {j,
1, 50}]

However, when I tested this with your tiny ranges of i and j, there was
no speedup. Changing to {i, 200} and {j, 500} gave a 2.42x speedup in
parallel.

-Zach
