From: Patrick Maupin on 28 Mar 2010 13:27 On Mar 28, 11:56 am, "Alf P. Steinbach" <al...(a)start.no> wrote: > From a more practical point of view, the sum efficiency could be improved by > doing the first addition using '+' and the rest using '+=', without changing the > behavior. Mod parent up!!!! :-)
From: Steve Howell on 28 Mar 2010 13:34 On Mar 28, 10:21 am, Steve Howell <showel...(a)yahoo.com> wrote: > import timeit > M = 10 > N = 8000 > > def in_place( > start = [], > sublists = ([[None] * M]) * N > ): > # only macro-optimized > i = 0 > for sublist in sublists: > if i == 0: > accum = start + sublist > i += 1 > else: > accum.extend(sublist) FYI I later obtained similar results with the more general: accum += sublist > if i == 0: > return 'whatever' # semantics here? > return accum > > def with_intermediates( > start = [], > sublists = ([[None] * M]) * N > ): > accum = start > for sublist in sublists: > accum = accum + sublist > return accum >
From: Patrick Maupin on 28 Mar 2010 14:16 On Mar 28, 12:34 pm, Steve Howell <showel...(a)yahoo.com> wrote: > FYI I later obtained similar results with the more general: > accum += sublist Yeah, but you still have to create an object of the correct type for accum. And for summing small lists, that will actually increase the runtime. (Worst case, a list of a single item: you have to create the accum object based on the type of the start value, then do two += into it, for the start value and the first item in the list, vs just doing a single + which automatically creates an object.) Personally, I think the most general approach, if sum were open to enhancements, would be to automatically apply Alf's suggestion of "+" on summing the first item to the start value, and "+=" on subsequent items. Also, you *could* add a parameter to sum(), for example, default "partialsums=False" that would allow the use of partial sums in those cases where that might give better behavior (such as floats and tuples).
From: Steve Howell on 28 Mar 2010 14:50 On Mar 28, 11:16 am, Patrick Maupin <pmau...(a)gmail.com> wrote: > On Mar 28, 12:34 pm, Steve Howell <showel...(a)yahoo.com> wrote: > > > FYI I later obtained similar results with the more general: > > accum += sublist > > Yeah, but you still have to create an object of the correct type for > accum. I think you overlooked the surrounding code. Here is the code again: def in_place( start = [], sublists = ([[None] * M]) * N ): # only macro-optimized i = 0 for sublist in sublists: if i == 0: accum = start + sublist i += 1 else: accum += sublist if i == 0: return 'whatever' # semantics here? return accum No need to infer the type. > And for summing small lists, that will actually increase the > runtime. (Worst case, a list of a single item: you have to create the > accum object based on the type of the start value, then do two += into > it, for the start value and the first item in the list, vs just doing > a single + which automatically creates an object.) > For small lists, the performance of any sane implementation would probably be so fast as to be negligible, unless you are in a really tight loop. If you are in a tight loop, then your use case probably eventually sums large lists. Even if it doesn't, the slowdown is just a constant. For M=5, I get these results on my machine: M N t1 t2 (t2/t1) 5 1 3.50475311279e-05 3.2901763916e-05 0.938775510204 5 2 2.00271606445e-05 1.59740447998e-05 0.797619047619 5 4 6.79492950439e-05 6.31809234619e-05 0.929824561404 5 8 0.000124931335449 0.000126123428345 1.00954198473 5 64 0.000530958175659 0.00226187705994 4.25999101931 5 1024 0.00262904167175 0.27246594429 103.636981953 t1 = time with add only first element t2 = time with add all elements Very small penalty for small lists, very large gain for large lists. > Personally, I think the most general approach, if sum were open to > enhancements, would be to automatically apply Alf's suggestion of "+" > on summing the first item to the start value, and "+=" on subsequent > items. See above. That's the approach I would use as well.
From: Paul Rubin on 28 Mar 2010 16:49
Steve Howell <showell30(a)yahoo.com> writes: > The documentation is pretty clear on the intention that sum() is > intended for numbers: ... I've never been big on the idea of duck-typing addition. Who would have thought that (1,2,3)+(4.5.6) was something other than the vector sum? I think itertools.chain more directly expresses what the OP was trying to do, and is preferable for readability, independently of these performance questions. |