list.pop(0) vs. collections.dequeue [Python]

Prev: distutils not finding all of my pure python modules
Next: iterating lists

From: Daniel Stutzbach on 24 Jan 2010 16:51

On Sun, Jan 24, 2010 at 1:53 PM, Steve Howell <showell30(a)yahoo.com> wrote:
> I don't think anybody provided an actual link, but please correct me
> if I overlooked it.

I have to wonder if my messages are all ending up in your spam folder
for some reason. :-)

PEP 3128 (which solves your problem, but not using the implementation
you suggest)
http://www.python.org/dev/peps/pep-3128/

Implementation as an extension module:
http://pypi.python.org/pypi/blist/

Related discussion:
http://mail.python.org/pipermail/python-3000/2007-April/006757.html
http://mail.python.org/pipermail/python-3000/2007-May/007491.html

Detailed performance comparison:
http://stutzbachenterprises.com/performance-blist

I maintain a private fork of Python 3 with the blist replacing the
regular list, as a way of rigorously testing the blist implementation.

Although I originally proposed a PEP, I am content to have the blist
exist as a third-party module.

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC

From: Steve Howell on 24 Jan 2010 23:12

On Jan 24, 12:44 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
> > Proposal: Improve list's implementation so that deleting elements from
> > the front of the list does not require an O(N) memmove operation. ...
> > It is possible now, of course, to use a data structure in
> > Python that has O(1) for deleting off the top of the list, but none of
> > the alternatives fully replicate the benefits of list itself.
>
> I think you are mostly referring to deque. Why don't you instead
> say what you think is wrong with using deque, and how deque can
> be improved?
>

There is nothing wrong with deque, at least as far as I know, if the
data strucure actually applies to your use case. It does not apply to
my use case.

> > See above. An implementation of this PEP would not change the
> > definition of the language in any way, but it would have to minimally
> > impact the performance of lists for the normal use cases.
>
> But you're talking about adding one or two words to EVERY list, and many
> normal use cases allocate a LOT of lists. Those use cases are likely
> more common than use cases that pop from the front of the list but for
> some reason can't use deque.

For EVERY list, you are not only allocating memory for the list
itself, but you are also allocating memory for the objects within the
list. So the extra one or two words are NEGLIGIBLE.

> > The most ambitious proposal is to fix the memory manager itself to
> > allow the release of memory from the start of the chunk.
>
> That's inappropriate given the memory fragmentation it would cause.
>

Bullshit. Memory managers consolidate free memory chunks all the
time. That's their job.

> Really, you're describing a problem that arises in a few programs but up
> til now, as far as I know, everyone has found deque to be an adequate
> solution.

Wrong. Many programs delete the first element of the list.

> Your approach of snarling against list is not persuading
> anyone that list needs to be changed, because most everyone is satisfied
> with the existing solution.

Please provide evidence of that. I am pretty sure that everybody who
chooses alternatives to Python would disagree.

> You might change approaches and discuss
> deque, what's wrong with it, and whether it can be fixed. Getting a
> change approved for deque is probably much easier than getting one
> approved for list, just because nowhere near as many things depend on
> deque's performance.

Again...I am not looking to improve deque, which is a perfectly valid
data structure for a limited set of problems.

> And when discussing performance in this contextc
> additive constants do matter.

Wrong again. Operations that mutate lists are already expensive, and
a few checks to see if unreleased memory can be reclaimed are totally
NEGLIGIBLE.

From: Steven D'Aprano on 25 Jan 2010 01:07

On Sun, 24 Jan 2010 20:12:11 -0800, Steve Howell wrote:

>> > The most ambitious proposal is to fix the memory manager itself to
>> > allow the release of memory from the start of the chunk.
>>
>> That's inappropriate given the memory fragmentation it would cause.
>>
>>
> Bullshit. Memory managers consolidate free memory chunks all the time.
> That's their job.

So let me get this straight...

You've complained that Python's list.pop(0) is lame because it moves
memory around. And your solution to that is to have the memory manager
move the memory around instead?

Perhaps I'm missing something, but I don't see the advantage here. At
best, you consolidate all those moves you wanted to avoid and do them all
at once instead of a few at a time. At worst, you get a situation where
the application periodically, and unpredictably, grinds to a halt while
the memory manager tries to defrag all those lists.

>> Really, you're describing a problem that arises in a few programs but
>> up til now, as far as I know, everyone has found deque to be an
>> adequate solution.
>
> Wrong. Many programs delete the first element of the list.

I'm sure they do. Many programs do all sorts of things, of varying
sensibleness. But I'm pretty sure I've never written a program that
deleted the first element of a list. Even if I have, it's a vanishingly
small use-case for me. YMMV.

>> Your approach of snarling against list is not persuading anyone that
>> list needs to be changed, because most everyone is satisfied with the
>> existing solution.
>
> Please provide evidence of that. I am pretty sure that everybody who
> chooses alternatives to Python would disagree.

Do you honestly believe that "everybody" who prefers another language
over Python does so because they dislike the performance of list.pop(0)?

>> You might change approaches and discuss deque, what's wrong with it,
>> and whether it can be fixed. Getting a change approved for deque is
>> probably much easier than getting one approved for list, just because
>> nowhere near as many things depend on deque's performance.
>
> Again...I am not looking to improve deque, which is a perfectly valid
> data structure for a limited set of problems.
>
>> And when discussing performance in this contextc additive constants do
>> matter.
>
> Wrong again. Operations that mutate lists are already expensive, and a
> few checks to see if unreleased memory can be reclaimed are totally
> NEGLIGIBLE.

Popping from the end of the list isn't expensive. Reversing lists is
relatively cheap. In-place modifications are very cheap.

--
Steven

From: Paul Rubin on 25 Jan 2010 02:24

Steve Howell <showell30(a)yahoo.com> writes:
> There is nothing wrong with deque, at least as far as I know, if the
> data strucure actually applies to your use case. It does not apply to
> my use case.

You haven't explained why deque doesn't apply to your use case. Until a
convincing explanation emerges, the sentiment you're creating seems to
be "what's wrong with that guy and why doesn't he just use deque?". So,
why aren't you using deque? If deque somehow isn't adequate for your
use case, maybe it can be improved.

>> Your approach of snarling against list is not persuading
>> anyone that list needs to be changed, because most everyone is satisfied
>> with the existing solution.
>
> Please provide evidence of that. I am pretty sure that everybody who
> chooses alternatives to Python would disagree.

I've heard of many reasons to choose alternatives to Python, and have
chosen alternatives to Python in various situations myself. The
list.pop(0) issue has never been one of those reasons for me or anyone
else I know of to choose an alternative until you came along. Anyway,
you're welcome to switch to another language; nobody's heart will be
broken if you do that. I'd be interested to know which languages handle
list.pop(0) the way you're proposing for Python.

>> And when discussing performance in this context, additive constants
>> do matter.

> Wrong again. Operations that mutate lists are already expensive

I'm talking about memory consumption, which is part of Python's concept
of performance. You're proposing adding a word or two to every list,
with insufficient justification presented so far. Any such
justification would have to include a clear and detailed explanation of
why using deque is insufficient, so that would be a good place to start.

On another note: the idea you're suggesting, while not yet 100%
convincing, is not crazy, which is why people are willing to discuss it
with you reasonably. But your confrontational style is making
discussion unpleasant. Can you dial it back a little? Your current
approach is perhaps leading you towards people's ignore lists.

From: Antoine Pitrou on 25 Jan 2010 06:29

Le Sun, 24 Jan 2010 11:28:53 -0800, Aahz a écrit :
>
> Again, your responsibility is to provide a patch and a spectrum of
> benchmarking tests to prove it. Then you would still have to deal with
> the objection that extensions use the list internals -- that might be an
> okay sell given the effort otherwise required to port extensions to
> Python 3, but that's not the way to bet.

IMO, code accessing the list internals should be considered broken. The
macros (PyList_GET_ITEM, etc.) are there for a reason. We can't just
freeze every internal characteristic of the interpreter just because
someone might be messing around with it in unrecommended ways.

First | Prev | Next | Last
Pages: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Prev: distutils not finding all of my pure python modules
Next: iterating lists