From: Steve Howell on 25 Jan 2010 12:31

On Jan 24, 11:24 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
> > There is nothing wrong with deque, at least as far as I know, if the
> > data structure actually applies to your use case. It does not apply to
> > my use case.
>
> You haven't explained why deque doesn't apply to your use case. Until a
> convincing explanation emerges, the sentiment you're creating seems to
> be "what's wrong with that guy and why doesn't he just use deque?". So,
> why aren't you using deque? If deque somehow isn't adequate for your
> use case, maybe it can be improved.

These are the reasons I am not using deque:

1) I want to use native lists, so that downstream methods can use them
as lists.
2) Lists are faster for accessing elements (a rough sketch showing one
way to check this is at the end of this message).
3) I want to be able to insert elements into the middle of the list.
4) I have no need for rotating elements.

I also have reasons for not using the other workarounds, such as
reversing the list.

> >> And when discussing performance in this context, additive constants
> >> do matter.
>
> > Wrong again. Operations that mutate lists are already expensive.
>
> I'm talking about memory consumption, which is part of Python's concept
> of performance. You're proposing adding a word or two to every list,
> with insufficient justification presented so far. Any such
> justification would have to include a clear and detailed explanation of
> why using deque is insufficient, so that would be a good place to start.

Adding a word or two to a list is an O(1) addition to a data structure
that takes O(N) memory to begin with.

That extra pointer should really be weighed not just against the list
itself taking O(N) memory, but also against the fact that all the
elements in the list are also consuming memory (until they get popped
off). So adding the pointer has negligible cost.

Another way of looking at it is that you would need to have 250 or so
lists in memory at the same time before the extra pointer was even
costing you kilobytes of memory. My consumer laptop has 3027908k of
memory.
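For anyone who wants to check reason 2 for themselves, here is a rough
timeit sketch comparing indexed access in the middle of a list versus a
deque. The sizes and iteration counts are arbitrary and the absolute
numbers will vary by machine and Python version; the point is only the
relative cost:

import timeit

n = 100000
mid = n // 2

# repeatedly index into the middle of each container
t_list = timeit.timeit("lst[%d]" % mid,
                       setup="lst = list(range(%d))" % n,
                       number=100000)
t_deque = timeit.timeit("d[%d]" % mid,
                        setup="from collections import deque; d = deque(range(%d))" % n,
                        number=100000)

print("list  middle index: %.4f s" % t_list)
print("deque middle index: %.4f s" % t_deque)

I would expect the deque access to be noticeably slower, since a deque
is a linked structure of blocks and has to walk toward the requested
index, while a list is a flat array of pointers.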
From: Steve Howell on 25 Jan 2010 13:00

On Jan 24, 10:07 pm, Steven D'Aprano
<ste...(a)REMOVE.THIS.cybersource.com.au> wrote:
> On Sun, 24 Jan 2010 20:12:11 -0800, Steve Howell wrote:
> >> > The most ambitious proposal is to fix the memory manager itself to
> >> > allow the release of memory from the start of the chunk.
>
> >> That's inappropriate given the memory fragmentation it would cause.
>
> > Bullshit. Memory managers consolidate free memory chunks all the time.
> > That's their job.
>
> So let me get this straight...
>
> You've complained that Python's list.pop(0) is lame because it moves
> memory around. And your solution to that is to have the memory manager
> move the memory around instead?
>
> Perhaps I'm missing something, but I don't see the advantage here. At
> best, you consolidate all those moves you wanted to avoid and do them
> all at once instead of a few at a time. At worst, you get a situation
> where the application periodically, and unpredictably, grinds to a halt
> while the memory manager tries to defrag all those lists.

You are misunderstanding what I meant, because I did not explain it very
well. When you release memory from the front of the list, if the memory
before it was also free, the memory manager could consolidate the two
chunks into one free chunk. There is no rational scenario where the
memory manager grinds to a halt trying to "defrag all those lists." Of
course, once the list gets fully garbage collected, the entire chunk of
memory is freed up.

> >> Your approach of snarling against list is not persuading anyone that
> >> list needs to be changed, because most everyone is satisfied with the
> >> existing solution.
>
> > Please provide evidence of that. I am pretty sure that everybody who
> > chooses alternatives to Python would disagree.
>
> Do you honestly believe that "everybody" who prefers another language
> over Python does so because they dislike the performance of list.pop(0)?

No, I don't believe any statement that makes gross generalizations, so I
also don't believe "most everyone is satisfied with the existing
solution."

> >> You might change approaches and discuss deque, what's wrong with it,
> >> and whether it can be fixed. Getting a change approved for deque is
> >> probably much easier than getting one approved for list, just because
> >> nowhere near as many things depend on deque's performance.
>
> > Again...I am not looking to improve deque, which is a perfectly valid
> > data structure for a limited set of problems.
>
> >> And when discussing performance in this context, additive constants
> >> do matter.
>
> > Wrong again. Operations that mutate lists are already expensive, and a
> > few checks to see if unreleased memory can be reclaimed are totally
> > NEGLIGIBLE.
>
> Popping from the end of the list isn't expensive. Reversing lists is
> relatively cheap. In-place modifications are very cheap.

I am talking in relative terms here. I am saying that checking a single
flag in C code isn't gonna significantly slow down any operation that
calls list_resize(). Delete operations would already be doing a memmove
operation, and insert operations already have to decide whether to
optimistically allocate memory and create the new list element.

Regarding the extra use of memory, I addressed this in my prior posting.
Here is code for list_resize:

static int
list_resize(PyListObject *self, Py_ssize_t newsize)
{
    PyObject **items;
    size_t new_allocated;
    Py_ssize_t allocated = self->allocated;

    /* Bypass realloc() when a previous overallocation is large enough
       to accommodate the newsize.  If the newsize falls lower than half
       the allocated size, then proceed with the realloc() to shrink the
       list.
    */
    if (allocated >= newsize && newsize >= (allocated >> 1)) {
        assert(self->ob_item != NULL || newsize == 0);
        Py_SIZE(self) = newsize;
        return 0;
    }

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
     */
    new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

    /* check for integer overflow */
    if (new_allocated > PY_SIZE_MAX - newsize) {
        PyErr_NoMemory();
        return -1;
    } else {
        new_allocated += newsize;
    }

    if (newsize == 0)
        new_allocated = 0;
    items = self->ob_item;
    if (new_allocated <= ((~(size_t)0) / sizeof(PyObject *)))
        PyMem_RESIZE(items, PyObject *, new_allocated);
    else
        items = NULL;
    if (items == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    self->ob_item = items;
    Py_SIZE(self) = newsize;
    self->allocated = new_allocated;
    return 0;
}

Here is the code for list_ass_slice:

static int
list_ass_slice(PyListObject *a, Py_ssize_t ilow, Py_ssize_t ihigh, PyObject *v)
{
    /* Because [X]DECREF can recursively invoke list operations on
       this list, we must postpone all [X]DECREF activity until
       after the list is back in its canonical shape.  Therefore
       we must allocate an additional array, 'recycle', into which
       we temporarily copy the items that are deleted from the
       list. :-( */
    PyObject *recycle_on_stack[8];
    PyObject **recycle = recycle_on_stack; /* will allocate more if needed */
    PyObject **item;
    PyObject **vitem = NULL;
    PyObject *v_as_SF = NULL; /* PySequence_Fast(v) */
    Py_ssize_t n; /* # of elements in replacement list */
    Py_ssize_t norig; /* # of elements in list getting replaced */
    Py_ssize_t d; /* Change in size */
    Py_ssize_t k;
    size_t s;
    int result = -1;            /* guilty until proved innocent */
#define b ((PyListObject *)v)
    if (v == NULL)
        n = 0;
    else {
        if (a == b) {
            /* Special case "a[i:j] = a" -- copy b first */
            v = list_slice(b, 0, Py_SIZE(b));
            if (v == NULL)
                return result;
            result = list_ass_slice(a, ilow, ihigh, v);
            Py_DECREF(v);
            return result;
        }
        v_as_SF = PySequence_Fast(v, "can only assign an iterable");
        if(v_as_SF == NULL)
            goto Error;
        n = PySequence_Fast_GET_SIZE(v_as_SF);
        vitem = PySequence_Fast_ITEMS(v_as_SF);
    }
    if (ilow < 0)
        ilow = 0;
    else if (ilow > Py_SIZE(a))
        ilow = Py_SIZE(a);

    if (ihigh < ilow)
        ihigh = ilow;
    else if (ihigh > Py_SIZE(a))
        ihigh = Py_SIZE(a);

    norig = ihigh - ilow;
    assert(norig >= 0);
    d = n - norig;
    if (Py_SIZE(a) + d == 0) {
        Py_XDECREF(v_as_SF);
        return list_clear(a);
    }
    item = a->ob_item;
    /* recycle the items that we are about to remove */
    s = norig * sizeof(PyObject *);
    if (s > sizeof(recycle_on_stack)) {
        recycle = (PyObject **)PyMem_MALLOC(s);
        if (recycle == NULL) {
            PyErr_NoMemory();
            goto Error;
        }
    }
    memcpy(recycle, &item[ilow], s);

    if (d < 0) { /* Delete -d items */
        memmove(&item[ihigh+d], &item[ihigh],
                (Py_SIZE(a) - ihigh)*sizeof(PyObject *));
        list_resize(a, Py_SIZE(a) + d);
        item = a->ob_item;
    }
    else if (d > 0) { /* Insert d items */
        k = Py_SIZE(a);
        if (list_resize(a, k+d) < 0)
            goto Error;
        item = a->ob_item;
        memmove(&item[ihigh+d], &item[ihigh],
                (k - ihigh)*sizeof(PyObject *));
    }
    for (k = 0; k < n; k++, ilow++) {
        PyObject *w = vitem[k];
        Py_XINCREF(w);
        item[ilow] = w;
    }
    for (k = norig - 1; k >= 0; --k)
        Py_XDECREF(recycle[k]);
    result = 0;
 Error:
    if (recycle != recycle_on_stack)
        PyMem_FREE(recycle);
    Py_XDECREF(v_as_SF);
    return result;
#undef b
}
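If you want to watch list_resize()'s over-allocation from Python,
sys.getsizeof() reflects the allocated size, so you can see the growth
pattern mentioned in the comment above by appending one element at a
time. A quick sketch (the exact byte counts depend on your CPython
version and pointer width, so treat the output as illustrative only):

import sys

lst = []
last = sys.getsizeof(lst)
print("len=%d size=%d bytes" % (len(lst), last))
for i in range(64):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:
        # the size jumped, so list_resize() just did a realloc
        # with over-allocation
        print("len=%d size=%d bytes" % (len(lst), size))
        last = size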
From: Steve Howell on 25 Jan 2010 13:13

On Jan 25, 9:31 am, Steve Howell <showel...(a)yahoo.com> wrote:
> On Jan 24, 11:24 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
>
> > Steve Howell <showel...(a)yahoo.com> writes:
> > > There is nothing wrong with deque, at least as far as I know, if the
> > > data structure actually applies to your use case. It does not apply
> > > to my use case.
>
> > You haven't explained why deque doesn't apply to your use case. Until a
> > convincing explanation emerges, the sentiment you're creating seems to
> > be "what's wrong with that guy and why doesn't he just use deque?". So,
> > why aren't you using deque? If deque somehow isn't adequate for your
> > use case, maybe it can be improved.
>
> These are the reasons I am not using deque:
>
> 1) I want to use native lists, so that downstream methods can use
> them as lists.
> 2) Lists are faster for accessing elements.
> 3) I want to be able to insert elements into the middle of the list.
> 4) I have no need for rotating elements.
>
> I also have reasons for not using the other workarounds, such as
> reversing the list.
>
> > >> And when discussing performance in this context, additive constants
> > >> do matter.
>
> > > Wrong again. Operations that mutate lists are already expensive.
>
> > I'm talking about memory consumption, which is part of Python's concept
> > of performance. You're proposing adding a word or two to every list,
> > with insufficient justification presented so far. Any such
> > justification would have to include a clear and detailed explanation of
> > why using deque is insufficient, so that would be a good place to start.
>
> Adding a word or two to a list is an O(1) addition to a data structure
> that takes O(N) memory to begin with.
>
> That extra pointer should really be weighed not just against the list
> itself taking O(N) memory, but also against the fact that all the
> elements in the list are also consuming memory (until they get popped
> off). So adding the pointer has negligible cost.
>
> Another way of looking at it is that you would need to have 250 or so
> lists in memory at the same time before the extra pointer was even
> costing you kilobytes of memory. My consumer laptop has 3027908k of
> memory.

I should also point out that my telephone has gigabytes of memory. It's
a fairly expensive device, but I regularly carry multiple gigabytes of
memory around in my front pants pocket.

There are some valid reasons to reject a proposal to make deleting
elements off the top of the list O(1). Memory consumption is not one of
them.

Even the most naive patch to make pop(0) and del lst[0] advance the
pointer would eventually reclaim memory once the list is garbage
collected.

Also, by allowing users to pop elements off the list without a memmove,
you encourage users to discard elements earlier in the process, which
means you can amortize the garbage collection for the list elements
themselves (i.e. less spiky), and do it earlier.
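To make the "advance the pointer" idea concrete, here is a pure-Python
toy, with made-up names, that only illustrates the bookkeeping involved;
it is not the C patch itself, and a real patch would live inside
listobject.c so plain lists get the behavior with no wrapper:

class FrontPoppableList(object):
    """Toy only: a list plus one extra word of bookkeeping (a head offset)."""

    def __init__(self, items=None):
        self._items = list(items or [])
        self._head = 0                      # the "extra pointer" being debated

    def popleft(self):
        item = self._items[self._head]
        self._items[self._head] = None      # release the reference right away
        self._head += 1
        # Occasionally give the dead prefix back; done this rarely,
        # the compaction cost is amortized across many pops.
        if self._head > 64 and self._head * 2 >= len(self._items):
            del self._items[:self._head]
            self._head = 0
        return item

    def __len__(self):
        return len(self._items) - self._head

    def __getitem__(self, i):               # non-negative indexes only, for brevity
        return self._items[self._head + i]

A quick usage example:

fpl = FrontPoppableList(range(10))
print(fpl.popleft(), fpl.popleft(), len(fpl), fpl[0])   # 0 1 8 2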
From: Steve Howell on 25 Jan 2010 13:24

On Jan 24, 1:51 pm, Daniel Stutzbach <dan...(a)stutzbachenterprises.com>
wrote:
> On Sun, Jan 24, 2010 at 1:53 PM, Steve Howell <showel...(a)yahoo.com> wrote:
> > I don't think anybody provided an actual link, but please correct me
> > if I overlooked it.
>
> I have to wonder if my messages are all ending up in your spam folder
> for some reason. :-)
>
> PEP 3128 (which solves your problem, but not using the implementation
> you suggest)
> http://www.python.org/dev/peps/pep-3128/
>
> Implementation as an extension module:
> http://pypi.python.org/pypi/blist/
>
> Related discussion:
> http://mail.python.org/pipermail/python-3000/2007-April/006757.html
> http://mail.python.org/pipermail/python-3000/2007-May/007491.html
>
> Detailed performance comparison:
> http://stutzbachenterprises.com/performance-blist
>
> I maintain a private fork of Python 3 with the blist replacing the
> regular list, as a way of rigorously testing the blist implementation.
>
> Although I originally proposed a PEP, I am content to have the blist
> exist as a third-party module.

Hi Daniel,

I agree with what Raymond Hettinger says toward the top of the PEP.
Blist, while extremely useful, does seem to have to trade off
performance of common operations, notably "get item", in order to get
better performance for other operations (notably insert/delete).

My algorithm does exactly N pops and roughly N list accesses, so I would
be going from N*N + N to N + N log N if I switched to blist. That would
be at least a theoretical gain over the current performance, but if
pop() were O(1), I could get the whole thing down to N time.
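The N*N term is easy to see empirically. Here is a rough sketch, with
sizes chosen arbitrarily, that drains a list from the front versus from
the back:

import timeit

for n in (1000, 10000, 100000):
    t_front = timeit.timeit("while lst: lst.pop(0)",
                            setup="lst = list(range(%d))" % n, number=1)
    t_back = timeit.timeit("while lst: lst.pop()",
                           setup="lst = list(range(%d))" % n, number=1)
    print("n=%6d  pop(0) loop: %.4f s   pop() loop: %.4f s"
          % (n, t_front, t_back))

I would expect the pop(0) column to grow roughly quadratically with n,
because each front pop does a memmove of the remaining elements, while
the pop() column should grow roughly linearly.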
From: Paul Rubin on 25 Jan 2010 16:00
Steve Howell <showell30(a)yahoo.com> writes:
> These are the reasons I am not using deque:

Thanks for these. Now we are getting somewhere.

> 1) I want to use native lists, so that downstream methods can use
> them as lists.

It sounds like that could be fixed by making the deque API a proper
superset of the list API.

> 2) Lists are faster for accessing elements.

It sounds like that could be fixed by optimizing deque somewhat. Also,
have you profiled your application to show that accessing list elements
is actually using a significant fraction of its runtime and that it
would be slowed down noticeably by deque? If not, it's a red herring.

> 3) I want to be able to insert elements into the middle of the list.

I just checked, and was surprised to find that deque doesn't support
this. I'd say go ahead and file a feature request to add it to deque.

> 4) I have no need for rotating elements.

That's unpersuasive, since you're advocating adding a feature to list
that many others have no need for.

> Adding a word or two to a list is an O(1) addition to a data structure
> that takes O(N) memory to begin with.

Yes, as mentioned, additive constants matter.

> Another way of looking at it is that you would need to have 250 or so
> lists in memory at the same time before the extra pointer was even
> costing you kilobytes of memory.

I've often run applications with millions of lists, maybe tens of
millions. Of course it would be hundreds of millions if the machines
were big enough.

> My consumer laptop has 3027908k of memory.

I thought the idea of buying bigger machines was to solve bigger
problems, not to solve the same problems more wastefully.
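As a back-of-the-envelope sketch of that additive constant (assuming a
64-bit build where a pointer is 8 bytes, and taking "a word or two"
literally):

bytes_per_word = 8          # pointer size on a 64-bit build
extra_words = 2             # the "word or two" under discussion

for nlists in (250, 1000000, 10000000):
    extra_bytes = nlists * extra_words * bytes_per_word
    print("%10d lists -> %12d extra bytes (%.1f MiB)"
          % (nlists, extra_bytes, extra_bytes / (1024.0 * 1024.0)))

A few kilobytes across a couple hundred lists is noise; across tens of
millions of lists it is on the order of a hundred megabytes, which is
the kind of number I have in mind when I say additive constants matter.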