Next: Pool Module: iterator does not yield consistently with different chunksizes
From: syockit on 2 Jul 2010 04:44

I've been playing around with custom iterators to map into Pool. When I run the code below:

    def arif(arr):
        return arr

    def permutate(n):
        k = 0
        a = list(range(6))
        while k<n:
            for i in range(6):
                a.insert(0, a.pop(5)+6)
            #yield a[:]  <-- produces correct results
            yield a
            k += 1
        return

    def main():
        from multiprocessing import Pool
        pool = Pool()
        chksize = 15
        for x in pool.imap_unordered(arif, permutate(100), chksize):
            print(x)

    if __name__=="__main__":
        main()

... will output something like this:

    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [36, 37, 38, 39, 40, 41]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [72, 73, 74, 75, 76, 77]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [108, 109, 110, 111, 112, 113]
    [144, 145, 146, 147, 148, 149]

... where results are duplicated number of times equal to chunk size, and the results between the gap are lost. Using a[:] instead, i get:

    [6, 7, 8, 9, 10, 11]
    [12, 13, 14, 15, 16, 17]
    [18, 19, 20, 21, 22, 23]
    [24, 25, 26, 27, 28, 29]
    [30, 31, 32, 33, 34, 35]
    [36, 37, 38, 39, 40, 41]
    [42, 43, 44, 45, 46, 47]
    [48, 49, 50, 51, 52, 53]

... it comes out okay. Any explanation for such behavior?

Ahmad Syukri
From: Peter Otten on 2 Jul 2010 06:22
syockit wrote:

> I've been playing around with custom iterators to map into Pool.
> [...]
> ... where results are duplicated number of times equal to chunk size,
> and the results between the gap are lost. Using a[:] instead, i get:
> [...]
> ... it comes out okay. Any explanation for such behavior?

Python passes references around, not copies. Consider

    from itertools import islice

    it = permutate(100)
    chunksize = 15
    while True:
        # collect chunksize items from the generator before doing anything with them
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            break
        # dispatch items in chunk
        print(chunk)

chunksize items are calculated before they are dispatched. When you yield the same list every time in permutate(), the earlier items in the chunk see every change you make to the list to update it to the next value, so by the time the chunk is dispatched all of its items show the last value computed.

Peter
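The aliasing can be seen without a Pool at all. The sketch below is only an illustration: the `copy` parameter and the `first_chunk` helper are made up for this example and are not part of the original code, and the chunk size of 3 is chosen just to keep the output short.

```python
from itertools import islice

def permutate(n, copy=False):
    # Same update step as in the original post; `copy` is a hypothetical
    # switch added here to choose between `yield a` and `yield a[:]`.
    a = list(range(6))
    for _ in range(n):
        for _ in range(6):
            a.insert(0, a.pop(5) + 6)
        yield a[:] if copy else a

def first_chunk(copy, chunksize=3):
    # Hypothetical helper: gather one chunk the way the loop above does.
    return tuple(islice(permutate(100, copy), chunksize))

shared = first_chunk(copy=False)
copied = first_chunk(copy=True)

print(shared)   # three references to one list, all showing its latest state
print(copied)   # three independent snapshots
print(all(item is shared[0] for item in shared))
```

With `yield a` every entry in the chunk is the same list object, so the whole chunk shows whatever the list held when the last item was produced. With `yield a[:]` each entry is a separate copy taken at yield time.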