From: Steve Howell on 26 Mar 2010 10:56

On Mar 26, 7:31 am, Steve Howell <showel...(a)yahoo.com> wrote:
> On Mar 24, 4:19 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> > kj <no.em...(a)please.post> writes:
> > > Is there a sequence-oriented equivalent to the sum built-in? E.g.:
> > > seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)
> >
> > Use itertools.chain for this. A few people have mentioned that sum
> > will also work, but I think for that purpose it could have O(n**2)
> > complexity.
>
> I agree on the practical matter that itertools.chain and other
> solutions are usually the way to go for most tasks that involve
> iterating through several lists.
>
> From a purely academic standpoint, I'm not convinced that sum() is
> inefficient in terms of big-O complexity, though.
>
> showell(a)showell-laptop:~$ python
> Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
> [GCC 4.3.3] on linux2
> >>> class StupidList:
> ...     def __init__(self, lst):
> ...         print 'creating', lst
> ...         self.lst = lst
> ...     def __add__(self, other):
> ...         self.lst += '|'
> ...         self.lst.extend(other.lst)
> ...         return self
> ...
> >>> result = sum([StupidList([1, 2]), StupidList([3, 4]),
> ...               StupidList([5, 6])], StupidList([0]))
> creating [1, 2]
> creating [3, 4]
> creating [5, 6]
> creating [0]
> >>> result.lst
> [0, '|', 1, 2, '|', 3, 4, '|', 5, 6]
>
> If I'm interpreting the above program correctly, then sum() is doing
> the most efficient thing under the hood--it appears to do the
> equivalent of += without creating unnecessary objects for
> intermediate sums.
>
> I think the special-case error message might be a case where
> practicality simply beats out purity. It would be nice if sum() were
> completely duck-typed, let-you-shoot-yourself-in-the-foot-if-you-
> know-what-you-are-doing, but maybe this was such a pitfall at one
> time that extra safeguards were put into sum(). I wonder how
> severely sum(), without the restriction, would underperform join()
> on modern versions of Python, though.
>
> >>> sum('1', '2')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sum() can't sum strings [use ''.join(seq) instead]
>
> Note that you can easily fake out sum() to get duck typing:
>
> >>> class EmptyStringStarter:
> ...     def __add__(self, other): return other
> ...
> >>> empty = EmptyStringStarter()
> >>> sum(['hello ', 'world'], empty)
> 'hello world'

Looking at the code answers my own questions:

http://svn.python.org/view/python/trunk/Python/bltinmodule.c?view=markup

Look for builtin_sum(). First you see the guard against strings:

    /* reject string values for 'start' parameter */
    if (PyObject_TypeCheck(result, &PyBaseString_Type)) {
        PyErr_SetString(PyExc_TypeError,
            "sum() can't sum strings [use ''.join(seq) instead]");
        Py_DECREF(iter);
        return NULL;
    }
    Py_INCREF(result);

Also, Paul's suspicion that sum() runs in O(N**2) time for lists is
confirmed by this comment:

    /* It's tempting to use PyNumber_InPlaceAdd instead of
       PyNumber_Add here, to avoid quadratic running time
       when doing 'sum(list_of_lists, [])'.  However, this
       would produce a change in behaviour: a snippet like

         empty = []
         sum([[x] for x in range(10)], empty)

       would change the value of empty. */

It's interesting, though, that you can construct classes pretty
easily, as I did above, that avoid the quadratic behavior, as long as
you do not mind mutating the start value--which I think is usually
fine, since the original start value usually is not useful afterward
anyway.
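In other words, the loop in builtin_sum() boils down to roughly this
pure-Python sketch (my paraphrase of the C code above, not the actual
implementation):

def sum_equivalent(iterable, start=0):
    # Rough Python 2 paraphrase of builtin_sum() in bltinmodule.c.
    if isinstance(start, basestring):
        # the string guard shown above
        raise TypeError("sum() can't sum strings [use ''.join(seq) instead]")
    result = start
    for item in iterable:
        # PyNumber_Add, *not* PyNumber_InPlaceAdd: every step builds a
        # brand-new object, which is exactly what makes
        # sum(list_of_lists, []) quadratic.
        result = result + item
    return result

StupidList escapes the quadratic behavior only because its __add__
mutates self and returns it, so "result = result + item" never copies
anything.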
From: Steven D'Aprano on 27 Mar 2010 06:18

On Fri, 26 Mar 2010 07:31:17 -0700, Steve Howell wrote:
> From a purely academic standpoint, I'm not convinced that sum() is
> inefficient in terms of big-O complexity, though.
>
> showell(a)showell-laptop:~$ python
> Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on
> linux2
> >>> class StupidList:
[...]

But it's not *sum* that is inefficient, it is sum *with a particular
data structure*.

Sure, if you create your own data structure, you can make it as
efficient as you like. Obviously the sum algorithm itself has to
perform one addition per item, or O(N), which scales tolerably well.
But each addition has a cost. If the cost is constant, then sum() as a
whole remains O(N). But if the cost of addition varies with N, sum()
degrades badly.

We can compare the performance of sum with different data structures,
starting with plain integers versus long integers:

>>> from timeit import Timer
>>> setup = 'data = [%d]*%d'
>>> for i in range(6):
...     t1 = Timer('sum(data, 0)', setup % (1, 10**i))
...     t2 = Timer('sum(data, 0)', setup % (10**50, 10**i))
...     print min(t1.repeat(number=1000)),
...     print min(t2.repeat(number=1000))
...
0.00179290771484 0.00263810157776
0.00340414047241 0.00854396820068
0.0190401077271 0.0502791404724
0.155302047729 0.645124912262
0.794432878494 2.55748295784
7.97877693176 25.3812758923

Both scale about equally well, but the cost varies significantly:
arithmetic on very large longints is expensive.

Now strings (with a trick to fool sum into accepting them) versus
lists. Note that I changed the number of iterations from 6 down to 5.
The reason why will become obvious:

>>> class EmptyStringStarter:
...     def __add__(self, other):
...         return other
...
>>> empty = EmptyStringStarter()
>>> setup = """from __main__ import empty; data = [%r]*%d"""
>>>
>>> for i in range(5):
...     t1 = Timer('sum(data, empty)', setup % ('a', 10**i))
...     t2 = Timer('sum(data, [])', setup % ([1], 10**i))
...     print min(t1.repeat(number=1000)),
...     print min(t2.repeat(number=1000))
...
0.00849103927612 0.00226998329163
0.0121459960938 0.0082700252533
0.0489149093628 0.186735153198
0.428920030594 5.28623914719
14.6552250385 589.102822065

Strings perform tolerably well, up to a point, but lists perform
terribly. And in fact, the relatively good performance of strings is
an artifact of recent versions of CPython. In Jython and IronPython,
and in older versions of CPython, strings behave as poorly as lists.

> I wonder how severely sum(), without the restriction, would
> underperform join() on modern versions of Python, though.

Take note that, in order to get an answer in reasonable time, I've
reduced the number of timing iterations drastically:

>>> for i in range(6):
...     t1 = Timer('sum(data, empty)', setup % ('a', 10**i))
...     t2 = Timer('"".join(data)', setup % ('a', 10**i))
...     print min(t1.repeat(number=10)),
...     print min(t2.repeat(number=10))
...
8.89301300049e-05 1.09672546387e-05
0.000131845474243 2.19345092773e-05
0.000591993331909 9.29832458496e-05
0.0101289749146 0.00082802772522
0.365957021713 0.00884819030762
24.2072279453 0.0421011447906

Note the performance degradation of sum. It gets worse. Much worse:

>>> for i in range(4, 7):
...     t1 = Timer('sum(data, empty)', setup % ('a', 10**i))
...     t2 = Timer('"".join(data)', setup % ('a', 10**i))
...     print min(t1.repeat(number=1)),  # note fewer iterations
...     print min(t2.repeat(number=1))
...
0.031229019165 0.000817060470581
2.45445990562 0.00365781784058
1024.79705095 0.0398509502411

This is absolutely catastrophic performance degradation.

--
Steven
From: Steve Howell on 28 Mar 2010 10:16

On Mar 27, 3:18 am, Steven D'Aprano <st...(a)REMOVE-THIS-
cybersource.com.au> wrote:
> On Fri, 26 Mar 2010 07:31:17 -0700, Steve Howell wrote:
> > From a purely academic standpoint, I'm not convinced that sum() is
> > inefficient in terms of big-O complexity, though.
> >
> > showell(a)showell-laptop:~$ python
> > Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3]
> > on linux2
> > >>> class StupidList:
> [...]
>
> But it's not *sum* that is inefficient, it is sum *with a particular
> data structure*.

Yep, the implied context was with particular data structures.

> Sure, if you create your own data structure, you can make it as
> efficient as you like. Obviously the sum algorithm itself has to
> perform one addition per item, or O(N), which scales tolerably well.
> But each addition has a cost. If the cost is constant, then sum() as
> a whole remains O(N). But if the cost of addition varies with N,
> sum() degrades badly.

The surprising part of sum() is not that the outer loop to do the sums
is O(N); it is hard to imagine any other implementation (without
parallelizing it).

The mildly surprising part of sum() is that it does add vs.
add-in-place, which leads to O(N) vs. O(1) cost for the inner loop
calls for certain data structures, notably lists, even though none of
the intermediate results get used by the caller. For lists, you could
make a more efficient variant of sum() that clones the start value and
does add-in-place (a sketch of such a variant follows at the end of
this message).

I could guess pretty confidently that the reason this optimization was
never tried is that sum() has always been intended to be used on
numerics, since other alternatives exist for strings (join), lists
(chain), and hand-coded data classes that support add-in-place
(roll-your-own loop).

The documentation is pretty clear about the intention that sum() is
for numbers:

http://docs.python.org/library/functions.html#sum

Except for strings, though, the docs are not explicit about efficiency
concerns for other data structures, or about the fact that the
reference implementation does add vs. add-in-place under the hood.
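To make the clone-and-add-in-place idea concrete, here is a minimal
sketch (the name isum is made up, and a real implementation would have
to decide how to handle objects that don't support copy.copy or +=):

import copy

def isum(iterable, start):
    # Like sum(), but copies the start value once up front and then
    # accumulates with +=. For lists, += is an in-place extend, so the
    # total work is linear in the number of items summed.
    result = copy.copy(start)
    for item in iterable:
        result += item
    return result

With this variant, isum([[x] for x in range(10)], []) flattens the
lists in linear time and leaves the caller's empty list untouched,
which sidesteps the mutation problem described in the C comment quoted
earlier in the thread.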
From: Steven D'Aprano on 28 Mar 2010 10:57

On Sun, 28 Mar 2010 07:16:10 -0700, Steve Howell wrote:
> The mildly surprising part of sum() is that it does add vs.
> add-in-place, which leads to O(N) vs. O(1) cost for the inner loop
> calls for certain data structures, notably lists, even though none
> of the intermediate results get used by the caller. For lists, you
> could make a more efficient variant of sum() that clones the start
> value and does add-in-place.

I have no doubt that you could make a version of sum for lists which
is more efficient than the existing one. After all, join more or less
does the same sort of thing, and it's very efficient. But don't think
that add-in-place is necessarily cheap. List appends are amortized
O(1) each; if you are adding M lists of N items each, that gives you
O(M*N).

It's possible to improve the performance a tad if you can make
multiple appends in roughly constant time, which is what list.extend
(probably?) does, but only up to a point. Lists are over-allocated,
but if you try to add more items than there is room for, you need to
make the list bigger, and that means reallocating memory, which could
easily be O(N**2) or worse, depending on how good your OS's memory
management is. Under Linux, at least by default, malloc will never
fail, but there's no guarantee how long it will take to return. If
the OOM killer has to start shutting down other applications, and
paging more and more memory to disk, eventually malloc will return
(or the system will core dump), but it could take a while...

--
Steven
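P.S. You can watch the over-allocation happen from the interactive
prompt with sys.getsizeof (CPython 2.6 or later; the exact sizes are
platform- and version-dependent, so treat this as a sketch):

import sys

lst = []
last = sys.getsizeof(lst)
for i in range(100):
    lst.append(None)
    size = sys.getsizeof(lst)
    if size != last:
        # a reallocation happened: CPython grabbed room for several
        # future appends in one step, which is what makes append
        # amortized O(1)
        print len(lst), size
        last = size

The lengths at which the size jumps show how much slack CPython
reserves on each resize.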
From: Duncan Booth on 28 Mar 2010 11:17
Steve Howell <showell30(a)yahoo.com> wrote:
> The mildly surprising part of sum() is that it does add vs.
> add-in-place, which leads to O(N) vs. O(1) cost for the inner loop
> calls for certain data structures, notably lists, even though none
> of the intermediate results get used by the caller. For lists, you
> could make a more efficient variant of sum() that clones the start
> value and does add-in-place.

Doing add-in-place isn't the only way to make sum more efficient: if
you assume that addition is associative (which of course the builtin
sum can't) then you can form partial sums. E.g. instead of
calculating:

    (((((((a + b) + c) + d) + e) + f) + g) + h)

you calculate:

    (((a + b) + (c + d)) + ((e + f) + (g + h)))

Obviously this requires more space than the naive sum, but not as much
as you might at first expect: you only need to hold log(n)
intermediate values at any time.

> I could guess pretty confidently that the reason this optimization
> was never tried is that sum() has always been intended to be used on
> numerics, since other alternatives exist for strings (join), lists
> (chain), and hand-coded data classes that support add-in-place
> (roll-your-own loop).

Doing it this way helps summing lists or strings (though not as much
as str.join), but it also helps if you need to sum a long list of
similarly sized floats, as you'll get a more accurate answer. See

http://groups.google.com/group/comp.lang.python/browse_thread/thread/9eda29faf92f532e/027cef7d4429aa3a

for an earlier discussion of this, or just Google comp.lang.python for
'pairwise sum'.

Here's the code I posted in that thread:

def sumpairs(seq):
    tmp = []
    for i, v in enumerate(seq):
        if i & 1:
            tmp[-1] = tmp[-1] + v
            i = i + 1
            n = i & -i
            while n > 2:
                t = tmp.pop(-1)
                tmp[-1] = tmp[-1] + t
                n >>= 1
        else:
            tmp.append(v)
    while len(tmp) > 1:
        t = tmp.pop(-1)
        tmp[-1] = tmp[-1] + t
    return tmp[0]

and I claimed that my Python-coded sumpairs function was faster than
the builtin sum on a list of lists once you had more than about 210
items. I never did get round to rewriting it in C for a more realistic
speed comparison: summing integers, my Python version is about 60
times slower than the builtin.
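If the carry-bit bookkeeping in sumpairs() is hard to follow, the same
log(n)-deep tree of additions can be written recursively (a sketch
only: the slice copies make it slower than the version above, it just
shows the shape of the computation):

def pairwise_sum(seq):
    # Balanced addition tree over a non-empty sequence: split the
    # sequence in half, sum each half, then add the two partial sums.
    n = len(seq)
    if n == 1:
        return seq[0]
    mid = n // 2
    return pairwise_sum(seq[:mid]) + pairwise_sum(seq[mid:])

For example, pairwise_sum([a, b, c, d]) computes (a + b) + (c + d)
rather than ((a + b) + c) + d.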