From: Steven D'Aprano on 9 May 2010 02:17

On Sat, 08 May 2010 14:06:33 -0700, Oltmans wrote:

> On May 9, 1:53 am, superpollo <ute...(a)esempio.net> wrote:
>
>> add = lambda a,b: a+b
>> for i in reduce(add,a):
>>     print i
>
> This is very neat. Thank you. Sounds like magic to me. Can you please
> explain how does that work? Many thanks again.

Don't use this except for small lists, it is very inefficient and will be slow for large lists. It is a Shlemiel The Painter algorithm:

http://www.joelonsoftware.com/articles/fog0000000319.html

The most idiomatic solution is a simple, straightforward nested iteration:

    for sublist in a:
        for item in sublist:
            do_something_with(item)

Say that there are 10 sublists with 10 items each. Then nested iteration will iterate 100 times in total. The solution with reduce will iterate:

    10+10  # add the first sublist and the second sublist
    20+10  # add the third sublist
    30+10  # add the fourth sublist
    40+10  # and so on...
    50+10
    60+10
    70+10
    80+10
    90+10  # add the last sublist
    100    # and now iterate over the combined list

or 640 times in total. If there are 100 sublists of 10 items each, the performance is even worse: 51,490 for the reduce solution, versus 1000 for the nested iteration.

Admittedly those iterations will be in fast C code instead of slow Python code, which is why you might not notice the difference at first, but you're still doing a lot of unnecessary work which takes time. How much time? Python makes it easy to find out.

>>> from timeit import Timer
>>> setup = "data = [range(10) for i in range(10)]"
>>> t1 = Timer("""for sublist in data:
...     for item in sublist:
...         pass""", setup)
>>> t2 = Timer("""for item in reduce(lambda x,y: x+y, data):
...     pass""", setup)
>>>
>>> min(t1.repeat(number=100000))
0.94107985496520996
>>> min(t2.repeat(number=100000))
1.7509880065917969

So for ten sublists of ten items each, the solution using reduce is nearly twice as slow as the nested iteration.

If we make the number of lists ten times larger, the nested for-loop solution takes about ten times longer, as you would expect:

>>> setup = "data = [range(10) for i in range(100)]"
>>> t1 = Timer("""for sublist in data:
...     for item in sublist:
...         pass""", setup)
>>> min(t1.repeat(number=100000))
10.349304914474487

But the reduce solution slows down by a factor of roughly thirty-three rather than ten:

>>> t2 = Timer("""for item in reduce(lambda x,y: x+y, data):
...     pass""", setup)
>>> min(t2.repeat(number=100000))
58.116463184356689

If we were to increase the number of sublists further, the reduce solution would perform even worse.

--
Steven
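
The iteration counts quoted in the post above (640 and 51,490) can be reproduced with a short sketch. The function names `reduce_work` and `nested_work` are hypothetical, not from the thread, and the cost model is an assumption taken from the post's own tally: each list concatenation is charged the length of the list it produces. The sketch is plain arithmetic and runs on Python 3.

```python
def reduce_work(num_sublists, sublist_len):
    """Element visits for `for item in reduce(add, data)` on equal-length lists.

    Assumed cost model (mirrors the tally in the post): each x + y is
    charged len(x) + len(y), the size of the new list it builds.
    """
    work = 0
    combined = sublist_len  # length of the accumulated list so far
    for _ in range(num_sublists - 1):
        combined += sublist_len  # accumulated list + next sublist
        work += combined         # every element of the new list is copied
    return work + combined       # plus the final pass over the full list


def nested_work(num_sublists, sublist_len):
    """Element visits for the plain nested for-loop."""
    return num_sublists * sublist_len


print(reduce_work(10, 10), nested_work(10, 10))
print(reduce_work(100, 10), nested_work(100, 10))
```

Under this model the reduce approach grows roughly with the square of the number of sublists, while the nested loop grows linearly.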

From: Steven D'Aprano on 9 May 2010 05:18

On Sun, 09 May 2010 15:17:38 +1000, Lie Ryan wrote:

> On 05/09/10 07:09, Günther Dietrich wrote:
>>
>> Why not this way?
>>
>> >>> a = [[1,2,3,4], [5,6,7,8]]
>> >>> for i in a:
>> ...     for j in i:
>> ...         print(j)
>> ...
>> 1
>> 2
>> 3
>> 4
>> 5
>> 6
>> 7
>> 8
>>
>> Too simple?
>
> IMHO that's more complex due to the nested loop,

What's so complex about a nested loop?

And why are you saying that it is "more complex" than the Original Poster's solution, which also had a nested loop, plus a pointless list comprehension?

> though I would
> personally do it as:
>
> a = [ [1,2,3,4], [5,6,7,8] ]
> from itertools import chain
> for i in chain.from_iterable(a):
>     print i
>
> so it won't choke when 'a' is an infinite stream of iterables.

Neither will a nested for-loop.

--
Steven
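
Both claims in the exchange above (that `chain.from_iterable` and a nested for-loop each cope with an infinite stream of iterables) can be checked with a small sketch. The generator `stream_of_sublists` and the wrapper `flatten_nested` are hypothetical names introduced here for illustration; the nested loop is wrapped in a generator only so that `islice` can stop it after a few items. Written for Python 3.

```python
from itertools import chain, count, islice


def stream_of_sublists():
    """Hypothetical infinite stream of lists: [0, 1], [2, 3], [4, 5], ..."""
    for start in count(0, 2):
        yield [start, start + 1]


def flatten_nested(iterables):
    """The nested for-loop, wrapped in a generator so it can be sliced."""
    for sublist in iterables:
        for item in sublist:
            yield item


# Take only the first five items from each flattened, endless stream.
first_nested = list(islice(flatten_nested(stream_of_sublists()), 5))
first_chained = list(islice(chain.from_iterable(stream_of_sublists()), 5))

print(first_nested)
print(first_chained)
```

Neither version needs the outer stream to end before it produces its first item.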

From: Lie Ryan on 9 May 2010 08:52

On 05/09/10 19:18, Steven D'Aprano wrote:
> On Sun, 09 May 2010 15:17:38 +1000, Lie Ryan wrote:
>
>> On 05/09/10 07:09, Günther Dietrich wrote:
>>>
>>> Why not this way?
>>>
>>> >>> a = [[1,2,3,4], [5,6,7,8]]
>>> >>> for i in a:
>>> ...     for j in i:
>>> ...         print(j)
>>> ...
>>> 1
>>> 2
>>> 3
>>> 4
>>> 5
>>> 6
>>> 7
>>> 8
>>>
>>> Too simple?
>>
>> IMHO that's more complex due to the nested loop,
>
> What's so complex about a nested loop?

One more nested tab. That extra whitespace is quite irritating.

> And why are you saying that it is
> "more complex" than the Original Poster's solution, which also had a
> nested loop, plus a pointless list comprehension?

You misunderstood. Tycho Anderson posted an itertools.chain(*chain) solution, to which Günther Dietrich remarked "why not a nested loop"; I am replying to Günther Dietrich's nested loop with "because a nested loop is more complex than chain()", and added that the original [Tycho Anderson's] chain solution has a subtle bug when facing an infinite generator of iterables.

>> though I would
>> personally do it as:
>>
>> a = [ [1,2,3,4], [5,6,7,8] ]
>> from itertools import chain
>> for i in chain.from_iterable(a):
>>     print i
>>
>> so it won't choke when 'a' is an infinite stream of iterables.
>
> Neither will a nested for-loop.
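
The "subtle bug" attributed above to the starred `chain(*...)` form comes from argument unpacking: the outer iterable is drained in full before `chain` is even called. A minimal sketch, assuming Python 3 and using a hypothetical helper `tracked_sublists` that records each sublist as it is pulled, makes the difference visible with a finite stream:

```python
from itertools import chain


def tracked_sublists(log, n=4):
    """Yield n small lists, appending to `log` each time one is pulled."""
    for i in range(n):
        log.append(i)
        yield [i, i * 10]


# chain(*iterable): unpacking drains the outer iterable up front.
star_log = []
star_iter = chain(*tracked_sublists(star_log))
first_star = next(star_iter)
pulled_by_star = len(star_log)

# chain.from_iterable: the outer iterable is consumed lazily.
lazy_log = []
lazy_iter = chain.from_iterable(tracked_sublists(lazy_log))
first_lazy = next(lazy_iter)
pulled_by_lazy = len(lazy_log)

print(pulled_by_star, pulled_by_lazy)
```

With an infinite outer stream, the starred call would never return, whereas `chain.from_iterable` pulls one sublist at a time.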

From: Steven D'Aprano on 9 May 2010 13:58

On Sun, 09 May 2010 22:52:55 +1000, Lie Ryan wrote:

>>> IMHO that's more complex due to the nested loop,
>>
>> What's so complex about a nested loop?
>
> One more nested tab. That extra whitespace is quite irritating.

Then say you don't like it; don't try to make a subjective dislike seem objectively bad with a spurious claim of complexity. There's nothing complex about an extra level of indentation. It's *one token*, with zero run-time cost and virtually no compile-time cost.

--
Steven

From: Jean-Michel Pichavant on 10 May 2010 07:30

Oltmans wrote:
> On May 9, 1:53 am, superpollo <ute...(a)esempio.net> wrote:
>
>> add = lambda a,b: a+b
>> for i in reduce(add,a):
>>     print i
>
> This is very neat. Thank you. Sounds like magic to me. Can you please
> explain how does that work? Many thanks again.

shorter <> nicer IMO.
Those alternatives are interesting from a tech point of view, but nothing can beat the purity of a vintage 'for' loop with *meaningful names*.

    salads = [['apple', 'banana'], ['apple', 'lemon', 'kiwi']]

    ingredients = []
    for salad in salads:
        for fruit in salad:
            ingredients.append(fruit)

    print 'Remember to buy %s' % ingredients

Lame & effective (1st adjective is irrelevant outside a geek contest)

JM