From: Kevin Brown on 4 Aug 2010 14:46

I have a trial version of the Parallel Computing Toolbox that I'm testing out. I'm using the parfor command to process some loops in parallel. The computation within each loop iteration is relatively small (3 seconds or so), so I'm not getting the speed-up from 8 workers (one per core) that I had hoped for - in one instance, it's only 3x faster and in another, it's only 20% faster.

I understand that the overhead from initializing and creating memory for the workers is probably eating into the performance gain. Is there some way to examine where that time is spent, so that I can optimize my code better? Like the profiler, but for parallel jobs? For example, if I knew that it was the memory transfer that was eating up the time, I could try to design my code to require less memory transfer.

Thanks in advance,
- Kevin
From: Edric M Ellis on 5 Aug 2010 04:16

"Kevin Brown" <kevin.m.brown(a)philips.com> writes:

> I have a trial version of the Parallel Computing Toolbox that I'm testing out.
> I'm using the parfor command to process some loops in parallel. The computation
> within each loop is relatively small (3 seconds or so), so I'm not getting the
> speed-up from 8 workers (one per core) that I had hoped for - in one instance,
> it's only 3x faster and in another, it's only 20% faster.
>
> I understand that the overhead from initializing and creating memory for the
> workers is probably eating into the performance gain. Is there some way to
> examine where that time is spent, so that I can optimize my code better? Like
> the profiler, but for parallel jobs? For example, if I knew that it was the
> memory transfer that was eating up the time, I could try to design for less
> memory transfer required.
>
> Thanks in advance, - Kevin

Unfortunately, we don't currently have a good solution for profiling the data transfers involved in PARFOR loops. You can track the time used on the client by running the standard profiler with the "-timer real" option to see how much time is taken in the PARFOR loops themselves.

Cheers,
Edric.
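
The suggestion above can be sketched roughly as follows. This is only an illustration, not code from either poster: the loop body `work`, the iteration count `N`, and the variable names are hypothetical placeholders, and it assumes a pool of workers is already open. It wraps the PARFOR loop in the standard profiler using wall-clock time, then runs the same loop serially so the two elapsed times can be compared.

```matlab
% Hypothetical stand-in for the real per-iteration computation.
work = @(i) sum(sum(rand(200) * rand(200)));
N = 64;                              % hypothetical iteration count

% Time the PARFOR loop on the client with wall-clock ("real") timing.
% Assumes a worker pool is already open.
parResults = zeros(1, N);
profile on -timer real
tic;
parfor i = 1:N
    parResults(i) = work(i);
end
tParallel = toc;
profile off
profile viewer                       % inspect time attributed to the PARFOR loop

% Serial baseline of the same loop, for comparison.
serResults = zeros(1, N);
tic;
for i = 1:N
    serResults(i) = work(i);
end
tSerial = toc;

fprintf('serial %.2f s, parallel %.2f s, speed-up %.2fx\n', ...
        tSerial, tParallel, tSerial / tParallel);
```

As noted in the reply, this only shows the time spent in the loop as seen from the client; it does not break out how much of that time went to transferring data to and from the workers.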