From: ibisek on 7 Oct 2009 05:49

Edric, in your code, to get the functionality I need I would have to call the build within the loop, like this:

spmd
    x = codistributed.zeros( 100 ); % 100x100 array spread across the workers
    lp = getLocalPart( x );         % The underlying "local part"
    rr = globalIndices( x, 1 );     % Which rows in the global array do we have?
    cc = globalIndices( x, 2 );     % Which columns?
    for ii = 1:length( rr )
        for jj = 1:length( cc )
            % perform some calculation on the local part
            lp(ii, jj) = labindex * (rr(ii) + cc(jj));
            % We've modified the local part of "x", we need to put it back together
            x = codistributed.build( lp, getCodistributor( x ) );
        end
    end
end

but that throws an error:

communication mismatch error was encountered:
The other lab became idle during receiving from lab 3 (tag: 32442)

Error stack:
gop>gop/receiveWithErrorHandling at 94
gop.m at 79
isreplicated.m at 23
hBuildFromLocalPartImpl>buildWithCompleteDist at 46
hBuildFromLocalPartImpl.m at 17
build.m at 107
smp_soma_all_to_all_adaptive>(spmd body) at 117
.....

Placing it anywhere other than where you suggested always results in this error. Until I call xd = gather( x ); I cannot see the entire current data, so I cannot use this approach... I guess I'll have to try labSend... will let you know how it ends up.

To Bruno: Your InplaceArray library is my next move. I hope it is not going to copy the object behind the pointers :)

Thanks guys so far, I'll let you know how it goes :)
From: Edric M Ellis on 7 Oct 2009 06:31

"ibisek " <cervenka(a)fai.utb.cz> writes:

> in your code, to get the functionality I need I would have to call the
> build within the loop, like this:
>
> spmd
>     x = codistributed.zeros( 100 ); % 100x100 array spread across the workers
>     lp = getLocalPart( x );         % The underlying "local part"
>     rr = globalIndices( x, 1 );     % Which rows in the global array do we have?
>     cc = globalIndices( x, 2 );     % Which columns?
>     for ii = 1:length( rr )
>         for jj = 1:length( cc )
>             % perform some calculation on the local part
>             lp(ii, jj) = labindex * (rr(ii) + cc(jj));
>             % We've modified the local part of "x", we need to put it back together
>             x = codistributed.build( lp, getCodistributor( x ) );
>         end
>     end
> end
>
> but that throws an error:
>
> communication mismatch error was encountered:
> The other lab became idle during receiving from lab 3 (tag: 32442)
>
> [error stack elided]
>
> Placing it anywhere other than where you suggested always results in this error.

That's because "codistributed.build" is a collective call (it involves communication) - all workers must ensure that they call that function at the same time. Inside the loop as you have it, sometimes some workers will call it and others will not (although you might get lucky if each worker happens to have the same number of columns of x).

> Until I call xd = gather( x ); I cannot see the entire current data, so I
> cannot use this approach... I guess I'll have to try labSend... will let
> you know how it ends up.

Using the low-level message-passing functions such as labSend and labReceive can be tricky; it might help if you can show the communication pattern you need between calculations. Is it the case that any worker might need a value from any other worker, or is there more structure than that?

Cheers, Edric.
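[A minimal sketch of the fix Edric describes, using the same 100x100 example: do all the local-part updates inside the loops, then call the collective codistributed.build exactly once afterwards, so every worker reaches it together.]

```matlab
spmd
    x = codistributed.zeros( 100 );
    lp = getLocalPart( x );
    rr = globalIndices( x, 1 );
    cc = globalIndices( x, 2 );
    for ii = 1:length( rr )
        for jj = 1:length( cc )
            % purely local work - no communication here
            lp(ii, jj) = labindex * (rr(ii) + cc(jj));
        end
    end
    % Collective call: every worker reaches this point exactly once,
    % regardless of how many rows/columns it owns locally.
    x = codistributed.build( lp, getCodistributor( x ) );
end
```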
From: ibisek on 7 Oct 2009 07:55

Edric M Ellis wrote in message ...

> Is it the case that any worker might need a value from any
> other worker, or is there more structure than that?

Yes, it is exactly that case. I have a matrix in which rows are 'individuals' (of an evolutionary algorithm) and columns are properties of those individuals. I migrate these individuals (rows) to ALL other rows (that's why I also need the rest of the rows accessible) and then modify a 'working'/current row (to store a modified individual). Therefore, the functionality I am seeking is something like this in Java:

private volatile float[][] myArray;

public synchronized float[] getIndividual(int row) {
    return myArray[row];
}

public synchronized void setRow(int rowNumber, float[] row) {
    myArray[rowNumber] = row;
}

Frankly, here I would not be afraid to access the myArray array directly, as long as I take care not to write into the same row from different threads at the same time. Or, more precisely, to separate the array by rows so that threads can write only into 'their' rows but can read from all, while the data reflects the actual state of the data manipulated by all threads together.

In both parfor and spmd, I cannot access the fresh data from the other workers. From my point of view, spmd does exactly the same as parfor; you just see a little bit more of what is happening behind the scenes. Therefore I think InplaceArray might help - if Matlab copies just the pointers, not the data behind them. On the other hand, involving a library which is not in the standard Matlab installation might bring some inconvenience (for users who will want to use the algorithm). And I am a little worried about the communication overhead labSend may cause... Will try both and will report here soon.
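[For reference, one deadlock-safe way to do the all-to-all row exchange that ibisek describes with the low-level primitives is a ring of labSendReceive calls. This is a hypothetical sketch, not code from the thread; the row length of 10 and the row contents are placeholders.]

```matlab
spmd
    rowLen = 10;
    myRow = labindex * ones( 1, rowLen );  % this worker's current individual
    allRows = zeros( numlabs, rowLen );
    allRows( labindex, : ) = myRow;
    % In step "shift", send to the worker "shift" ahead and receive from the
    % worker "shift" behind. labSendReceive pairs the send and receive, so
    % no two workers ever block waiting on each other.
    for shift = 1:numlabs-1
        dest = mod( labindex - 1 + shift, numlabs ) + 1;
        src  = mod( labindex - 1 - shift, numlabs ) + 1;
        allRows( src, : ) = labSendReceive( dest, src, myRow );
    end
    % allRows now holds every worker's row on every worker.
end
```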
From: ibisek on 7 Oct 2009 09:16

All right, here I am again. I took the InplaceArray trail first, with this code:

% matlabpool open
len = 10;
A = zeros(len);
% just a simple multiplication table
for i=1:size(A,1)
    for j=1:size(A,2)
        A(i,j)=i*j;
    end
end

% create the pointer to the array; from here on we'll use only the pointer
p = inplacearray(A);

variant = 2;
switch variant
    case 1
        parfor i=1:len
            p(i,2)=i;
            p(i,3)=p(10,10);
        end
    case 2
        % assign array rows to workers:
        len = length(A);
        numWorkers = matlabpool('size');
        indexes = zeros(numWorkers, 2); % from-to
        tasksPerWorker = round(len/numWorkers);
        for i=1:numWorkers
            indexes(i,1) = (i-1)*tasksPerWorker + 1;
            indexes(i,2) = indexes(i,1) + tasksPerWorker - 1;
            if indexes(i,2) > len
                indexes(i,2) = len;
            end
        end
        % do the parallel stuff:
        spmd
            myIndexes = indexes(labindex,:);
            for i=myIndexes(1):myIndexes(2)
                disp(i);
                %p(i,3)=p(10,10);
            end
        end
end

pp = p(:,:);
releaseinplace(p)
pp
% matlabpool close

In variant 1 I get the 'variable p is indexed in different ways' warning already in the editor (on the line p(i,3)=p(10,10);). Running it results in

??? Error: The variable p in a parfor cannot be classified.
See Parallel for Loops in MATLAB, "Overview".

which is quite understandable. It means that the machinery behind parfor attempts to slice the data even though the 'p' variable is a pointer.

In variant 2 the run results in an immediate

------------------------------------------------------------------------
Segmentation violation detected at Wed Oct 07 14:59:53 2009
------------------------------------------------------------------------
.....

even though THERE IS NO 'p' VARIABLE INVOLVED at all. I think my desperate attempts to make this work are going to meet my resignation very soon :( Somebody must have faced this problem before. Where are those guys!? Or perhaps I just have a bad google expression...? Well, I still have labSend to explore....
From: Edric M Ellis on 8 Oct 2009 03:38
"ibisek " <cervenka(a)fai.utb.cz> writes: > Edric M Ellis wrote in message ... > >> Is it the case that any worker might need a value from any >> other worker, or is there more structure than that? > > Yes, it is exactly that case. I have a matrix in which rows are 'individuals' > (of an evolutionary algorithms) and columns are properties of that > individuals. I migrate these individuals (rows) to ALL others (rows) (tha's > why I need to have also the rest of the rows accessible) and then to modify a > 'working'/current row (to store a modified individual). Ok, it seems then that (co)distributed arrays might not be your best bet. You might be better off using normal arrays, and synchronising them using communication (by the way, MATLAB workers are all separate processes, so you cannot share memory between them directly without using special shared memory segments. And even then, that only works for the special case where the workers are on the same machine - in general, they might not be). Here's one approach (using "gcat", which contains calls to labSend and labReceive): % starting point calculated at the client X = zeros( 100 ); spmd % Because we're using communication, it's important that each worker does % the same number of iterations for iter = 1:100 % Lab 1 calculates all the rows to update and broadcasts that to all workers rowsToUpdate = labBroadcast( 1, randperm( 100 ) ); rowsToUpdate = rowsToUpdate( 1:numlabs ); % Each lab selects a given row to update rowToUpdate = rowsToUpdate( labindex ); % Calculation: each lab updates a distinct row. X( rowToUpdate, iter ) = labindex; % synchronise X - first use gcat to transfer updates to each % worker. Effectively, each worker sends their updated row to each % other worker. updates = gcat( X( rowToUpdate, : ), 1 ); % Then apply the updates to X X( rowsToUpdate, : ) = updates; end end % X is a Composite, but each element is identical, so we just need to fetch the % value from worker 1. X = X{1}; Cheers, Edric. |