sum for sequences? [Python]


From: Steven D'Aprano on 24 Mar 2010 17:07

On Wed, 24 Mar 2010 15:29:07 +0000, kj wrote:

> Is there a sequence-oriented equivalent to the sum built-in? E.g.:
>
> seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)
>
> ?

Yes, sum.

help(sum) is your friend.

>>> a = range(2)
>>> b = range(3)
>>> c = range(4)
>>> sum((a, b, c), [])
[0, 1, 0, 1, 2, 0, 1, 2, 3]

Beware though that sum on lists and tuples will be fairly inefficient if
you have lots of them. You may find that this will be much more efficient:

result = []
for seq in sequences:
    result.extend(seq)
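
The loop above can be wrapped up as a small function. This is only a sketch, written for Python 3 (where range() no longer returns a list, hence the list() calls); the name flatten_lists is made up for illustration and does not come from the thread.

```python
def flatten_lists(sequences):
    """Concatenate an iterable of sequences into one new list."""
    result = []
    for seq in sequences:
        # extend() appends in place, so each element is copied only once
        result.extend(seq)
    return result


a = list(range(2))
b = list(range(3))
c = list(range(4))
print(flatten_lists((a, b, c)))  # [0, 1, 0, 1, 2, 0, 1, 2, 3]
```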

--
Steven

From: Paul Rubin on 24 Mar 2010 19:19

kj <no.email(a)please.post> writes:
> Is there a sequence-oriented equivalent to the sum built-in? E.g.:
> seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)

use itertools.chain for this. A few people have mentioned that sum will
also work, but I think for that purpose it could have O(n**2)
complexity.
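
As a rough illustration of the itertools.chain suggestion, applied to the tuples from the original question (a sketch for Python 3; the variable names pairs and flat are assumptions, not from the thread):

```python
from itertools import chain

pairs = ((1, 2), (5, 6))

# chain.from_iterable walks each inner sequence in turn without building
# intermediate tuples; tuple() materialises the result once at the end.
flat = tuple(chain.from_iterable(pairs))
print(flat)  # (1, 2, 5, 6)

# sum() with an empty-tuple start gives the same answer here, but it
# builds a new tuple at every addition step.
assert sum(pairs, ()) == flat
```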

From: TomF on 25 Mar 2010 02:50

On 2010-03-24 14:07:24 -0700, Steven D'Aprano
<steven(a)REMOVE.THIS.cybersource.com.au> said:
> On Wed, 24 Mar 2010 15:29:07 +0000, kj wrote:
>
>> Is there a sequence-oriented equivalent to the sum built-in? E.g.:
>>
>> seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)
>>
>> ?
>
> Yes, sum.
>
> help(sum) is your friend.

You might not want to be so glib. The sum doc sure doesn't sound like
it should work on lists.

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start' (which defaults to 0).

-Tom

From: Steven D'Aprano on 25 Mar 2010 03:34

On Wed, 24 Mar 2010 23:50:23 -0700, TomF wrote:

> On 2010-03-24 14:07:24 -0700, Steven D'Aprano
> <steven(a)REMOVE.THIS.cybersource.com.au> said:
>> On Wed, 24 Mar 2010 15:29:07 +0000, kj wrote:
>>
>>> Is there a sequence-oriented equivalent to the sum built-in? E.g.:
>>>
>>> seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)
>>>
>>> ?
>>
>> Yes, sum.
>>
>> help(sum) is your friend.
>
> You might not want to be so glib. The sum doc sure doesn't sound like
> it should work on lists.
>
>     Returns the sum of a sequence of numbers (NOT strings) plus the
>     value of parameter 'start' (which defaults to 0).

What part of that suggested to you that sum might not be polymorphic?
Sure, it says numbers (which should be changed, in my opinion), but it
doesn't specify what sort of numbers -- ints, floats, or custom types
that have an __add__ method. It also singles out strings as excluded. Why
would you need to explicitly exclude strings, since they're not numbers,
if sum *only* works with numbers?

E.g. help(math.sin) could have said this, but doesn't:

    Return the sine of x (NOT a dictionary)

It doesn't need to, because dicts aren't exceptional: sin doesn't work on
anything *but* numbers. There's no __sin__ method to call on arbitrary
types.

The fact that sum does single out strings is a clear sign that strings
are treated as exceptional and suggests strongly that summing arbitrary
types should work. I'm not saying that help(sum) explicitly states that
it works with lists (it clearly doesn't), but it does suggest the
possibility and makes the experiment worth trying.
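
The experiment suggested above might look something like the following under Python 3. It is a sketch only: the Money class and the string_rejected flag are hypothetical names introduced for illustration.

```python
class Money:
    """Hypothetical non-numeric type that defines only __add__."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


# sum() accepts any type with a suitable __add__, given a matching start value
total = sum([Money(150), Money(250)], Money(0))
print(total.cents)  # 400

# Lists work the same way once start is an empty list
print(sum([[1, 2], [5, 6]], []))  # [1, 2, 5, 6]

# Strings are the one case that sum() explicitly refuses
try:
    sum(["hello ", "world"], "")
    string_rejected = False
except TypeError as exc:
    string_rejected = True
    print("TypeError:", exc)
```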

I'll also note that the Fine Manual makes it even more clear that sum is
polymorphic:

http://docs.python.org/library/functions.html#sum

--
Steven

From: Steve Howell on 26 Mar 2010 10:31

On Mar 24, 4:19 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> kj <no.em...(a)please.post> writes:
> > Is there a sequence-oriented equivalent to the sum built-in? E.g.:
> > seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6)
>
> use itertools.chain for this. A few people have mentioned that sum will
> also work, but I think for that purpose it could have O(n**2)
> complexity.

I agree on the practical matter that itertools.chain and other
solutions are usually the way to go for most tasks that involve
iterating through several lists.

From a purely academic standpoint, I'm not convinced that sum() is
inefficient in terms of big-O complexity, though.

showell@showell-laptop:~$ python
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
>>> class StupidList:
...     def __init__(self, lst):
...         print 'creating', lst
...         self.lst = lst
...     def __add__(self, other):
...         self.lst += '|'
...         self.lst.extend(other.lst)
...         return self
...
>>> result = sum([StupidList([1, 2]), StupidList([3,4]),
...     StupidList([5,6])], StupidList([0]))
creating [1, 2]
creating [3, 4]
creating [5, 6]
creating [0]
>>> result.lst
[0, '|', 1, 2, '|', 3, 4, '|', 5, 6]

If I'm interpreting the above program correctly, then sum() is doing
the most efficient thing under the hood--it appears to do the
equivalent of += without creating unnecessary objects for intermediate
sums.
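
One caveat about the experiment above: StupidList.__add__ mutates self and returns it, so the transcript cannot by itself tell whether sum() applies + or +=. A hedged way to probe this under Python 3 is to define both hooks and record which one is called. The Recorder class below is hypothetical, introduced only for this check. In CPython, sum() goes through ordinary binary addition, so with plain lists every step builds a new list and the start value is left untouched.

```python
class Recorder:
    """Hypothetical wrapper that records which addition hook sum() uses."""

    calls = []

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        Recorder.calls.append("__add__")
        # Binary addition: build and return a brand-new object
        return Recorder(self.items + other.items)

    def __iadd__(self, other):
        Recorder.calls.append("__iadd__")
        # In-place addition: mutate self and return it
        self.items.extend(other.items)
        return self


start = Recorder([0])
result = sum([Recorder([1, 2]), Recorder([3, 4])], start)
print(Recorder.calls)  # ['__add__', '__add__']
print(result.items)    # [0, 1, 2, 3, 4]
print(start.items)     # [0]  (the start object is not modified)
```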

I think the special-case error message might be a case where
practicality simply beats out purity. It would be nice if sum() were
completely duck-typed-let-you-shoot-yourself-in-foot-if-you-know-what-
you-are-doing, but maybe this was such a pitfall at one time, that
extra safeguards were put into sum(). I wonder how severely sum(),
without the restriction, would underperform join() on modern versions
of Python, though.

>>> sum('1', '2')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

Note that you can easily fake out sum() to get duck typing.

>>> class EmptyStringStarter:
...     def __add__(self, other): return other
...
>>> empty = EmptyStringStarter()
>>> sum(['hello ', 'world'], empty)
'hello world'
