From: pratip on 23 Sep 2009 23:50

Hi Everybody,

Recently I was looking through many of the parallel computation examples in the documentation of Mathematica 7.0.1. Though not always clear or complete, the documentation looks pretty impressive at first glance. Hence I decided to write a Mathematica implementation of Super Pi, a small piece of software that is very popular among overclockers. It computes Pi to a user-defined number of decimal digits, in parallel, using all the cores of your processor. Have a look:
http://files.extremeoverclocking.com/file.php?f=36

My goal was to write pure Mathematica code that computes Pi to three million decimal digits eight times in parallel, using the eight kernels available on my PC. Computing this once on my PC takes just around 3.885 seconds (with an Intel Core i7 975 Extreme processor).

fun[n_]:=Module[{a,tic,toc},
tic=TimeUsed[];
a=N[Pi,n*10^6];
toc=TimeUsed[];
toc-tic
];

(*For 3 million decimal digits*)
In[24]:= fun[3]
Out[24]= 3.885

Now let's look at the parallel configuration of the PC. One can see that I do indeed have eight kernels present in the system.

In[16]:= ParallelEvaluate[$ProcessID]
Out[16]= {6712,6636,7928,4112,7196,5832,3992,7484}

In[17]:= ParallelEvaluate[$MachineName]
Out[17]= {flowcrusher-pc,flowcrusher-pc,flowcrusher-pc,flowcrusher-pc,flowcrusher-pc,flowcrusher-pc,flowcrusher-pc,flowcrusher-pc}

To compute the same thing eight times in parallel I tried the following combinations, with no success at all. See the disappointing timing results for yourself.

First:
In[2]:= b=Table[3,{i,1,8}];tic=TimeUsed[];re=Parallelize[Map[fun[#] &,b],Method->"CoarsestGrained"];
toc=TimeUsed[];
toc-tic
Out[4]= 30.935

Second:
In[11]:= b=Table[3,{i,1,8}];tic=TimeUsed[];re=Parallelize[Map[fun[#] &,b],Method->"FinestGrained"];
toc=TimeUsed[];
toc-tic
Out[13]= 30.872

Third:
In[18]:= ParallelMap[fun[#] &, b] // Timing
Out[18]= {30.81, {3.884, 3.822, 3.854, 3.853, 3.837, 3.869, 3.822, 3.869}}

Fourth:
In[21]:= ParallelTable[fun[3],{i,1,8}]//Timing
Out[21]= {30.747,{3.868,3.807,3.837,3.838,3.806,3.854,3.884,3.853}}

Finally, to validate the claim that only a single kernel is being used by Mathematica in spite of all these parallel commands, we map our function serially over a list of eight threes, b={3,3,3,3,3,3,3,3}, and get the total time for the repeated computation.

Validation of the claim:
In[16]:= Map[fun[#]&,b]//Timing
Out[16]= {30.748,{3.854,3.822,3.853,3.838,3.837,3.822,3.869,3.853}}

The serial run takes the same 30-31 seconds, which shows that the parallel commands used above were simply useless.

I would highly appreciate it if any of you could shed some light on this problem. It is very basic in nature, but the idea involved is quite central to parallel computing. What I expect is that neat and clean Mathematica code can be written for this problem that brings the computation time down to somewhere around 6-8 seconds in place of the 30-31 seconds seen above. I will continue working on the problem, but in the meantime any of you are welcome to give it a try.

With best regards to all.

Pratip Chakraborty
From: David Bailey on 27 Sep 2009 21:38

Since nobody else has responded: I think you need to launch some kernels before running parallel tasks, though I have not really used this feature of Mathematica.

LaunchKernels[]

David Bailey
http://www.dbaileyconsultancy.co.uk
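A minimal sketch of how one might check whether kernels are already running before launching more. It assumes the default local kernel configuration; the guard around LaunchKernels[] is an illustrative choice, not part of the suggestion above. Note that the eight process IDs returned by ParallelEvaluate[$ProcessID] in the original post suggest the kernels were in fact already running there.

```mathematica
(* Launch the default kernels only if none are running yet *)
If[Length[Kernels[]] == 0, LaunchKernels[]];

(* Number of subkernels currently available for parallel work *)
$KernelCount

(* One process ID per running subkernel *)
ParallelEvaluate[$ProcessID]
```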
From: Patrick Scheibe on 29 Sep 2009 07:42

Hi,

your code is completely useless since I don't see why one should compute the same result eight times. But here is what you missed:

fun[n_] := First@AbsoluteTiming@N[Pi, n*10^6]
ParallelTable[fun[3], {i, 1, 4}] // AbsoluteTiming
DistributeDefinitions[fun];
ParallelTable[fun[3], {i, 1, 4}] // AbsoluteTiming

{29.608226, {6.893410, 6.849890, 6.845198, 6.848202}}
{10.246625, {9.339382, 10.221913, 9.790986, 9.587946}}

You should read ParallelTools/tutorial/Overview first!

Cheers
Patrick
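As a rough reading of the two timings Patrick reports (an interpretation, not something he states): the first output line appears to be the run before DistributeDefinitions and the second the run after it, both on four kernels. The arithmetic can be sketched as follows; the variable names are illustrative only.

```mathematica
(* Wall-clock totals copied from the two outputs in Patrick's post *)
before = 29.608226;  (* ParallelTable without DistributeDefinitions *)
after = 10.246625;   (* ParallelTable with DistributeDefinitions *)

speedup = before/after   (* about 2.89 *)
efficiency = speedup/4   (* about 0.72 for four tasks *)
```

The per-task times also rise from roughly 6.85 s to between 9.3 s and 10.2 s once the tasks really run side by side, which may reflect contention for shared resources such as memory bandwidth or cores; that would explain a speedup below 4.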
From: sakra on 29 Sep 2009 07:43

Before running any parallel computation you have to make the definition of the function fun available on the compute kernels by entering:

DistributeDefinitions[fun]

Symbols defined in the controller kernel do not become available automatically on the compute kernels. Unless the definition of the function fun is available, a compute kernel cannot reduce an expression involving the symbol fun. The expression will thus be reduced on the controller kernel instead. This explains why only a single kernel (the controller kernel) is actually used in your tests.

Sascha
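The explanation above can be sketched as a short session. This is a sketch under the thread's assumptions (Mathematica 7 with local subkernels already running); later versions are understood to distribute definitions automatically in many cases, so the explicit call may be redundant there. AbsoluteTiming is used instead of Timing or TimeUsed because those report CPU time of the controller kernel only, which says little about work done on the compute kernels.

```mathematica
(* Wall-clock seconds needed to compute n million digits of Pi *)
fun[n_] := First@AbsoluteTiming[N[Pi, n*10^6]]

(* Copy the definition of fun to every running compute kernel *)
DistributeDefinitions[fun];

(* Each compute kernel should now hold a non-empty definition of fun *)
ParallelEvaluate[DownValues[fun] =!= {}]

(* Total wall-clock time for eight independent evaluations *)
AbsoluteTiming[ParallelTable[fun[3], {i, 1, 8}]]
```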
From: Vince on 29 Sep 2009 07:45

Pratip,

You should see a linear speedup if you precede ParallelMap with DistributeDefinitions[fun]. Worked for me, with your code. Without it, I saw the same sequential behavior as you (no time to drill into that now).

Vince Virgilio
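One way to check the speedup claim on a given machine is to time the serial and parallel maps side by side. This is only a sketch: the names serial and parallel are illustrative, and the ratio actually observed will depend on the hardware. The Core i7 975 mentioned in the original post has four physical cores with two hardware threads each, so eight kernels may not yield an eightfold gain.

```mathematica
(* Wall-clock seconds needed to compute n million digits of Pi *)
fun[n_] := First@AbsoluteTiming[N[Pi, n*10^6]]
DistributeDefinitions[fun];

b = Table[3, {8}];  (* eight identical three-million-digit tasks *)

(* Time the same workload serially and in parallel *)
serial = First@AbsoluteTiming[Map[fun, b]];
parallel = First@AbsoluteTiming[ParallelMap[fun, b]];

(* Observed speedup factor *)
serial/parallel
```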