From: Steve Howell on 22 Jan 2010 14:14 The v2.6.4 version of the tutorial says this: ''' It is also possible to use a list as a queue, where the first element added is the first element retrieved (first-in, first-out); however, lists are not efficient for this purpose. While appends and pops from the end of list are fast, doing inserts or pops from the beginning of a list is slow (because all of the other elements have to be shifted by one). ''' Is that really true in CPython? It seems like you could advance the pointer instead of shifting all the elements. It would create some nuances with respect to reclaiming the memory, but it seems like an easy way to make lists perform better under a pretty reasonable use case. Does anybody know off the top of their head if the "have-to-be-shifted- by-one" warning is actually valid? http://docs.python.org/tutorial/datastructures.html
From: Chris Rebert on 22 Jan 2010 15:14 On Fri, Jan 22, 2010 at 11:14 AM, Steve Howell <showell30(a)yahoo.com> wrote: > The v2.6.4 version of the tutorial says this: > > ''' > It is also possible to use a list as a queue, where the first element > added is the first element retrieved (âfirst-in, first-outâ); however, > lists are not efficient for this purpose. While appends and pops from > the end of list are fast, doing inserts or pops from the beginning of > a list is slow (because all of the other elements have to be shifted > by one). > ''' > > Is that really true in CPython? Â It seems like you could advance the > pointer instead of shifting all the elements. Â It would create some > nuances with respect to reclaiming the memory, but it seems like an > easy way to make lists perform better under a pretty reasonable use > case. > > Does anybody know off the top of their head if the "have-to-be-shifted- > by-one" warning is actually valid? Judging by the "Sorted dictionary" thread responses: Yes. Cheers, Chris -- http://blog.rebertia.com
From: Steve Howell on 22 Jan 2010 15:22 On Jan 22, 12:14 pm, Chris Rebert <c...(a)rebertia.com> wrote: > On Fri, Jan 22, 2010 at 11:14 AM, Steve Howell <showel...(a)yahoo.com> wrote: > > The v2.6.4 version of the tutorial says this: > > > ''' > > It is also possible to use a list as a queue, where the first element > > added is the first element retrieved (first-in, first-out); however, > > lists are not efficient for this purpose. While appends and pops from > > the end of list are fast, doing inserts or pops from the beginning of > > a list is slow (because all of the other elements have to be shifted > > by one). > > ''' > > > Is that really true in CPython? It seems like you could advance the > > pointer instead of shifting all the elements. It would create some > > nuances with respect to reclaiming the memory, but it seems like an > > easy way to make lists perform better under a pretty reasonable use > > case. > > > Does anybody know off the top of their head if the "have-to-be-shifted- > > by-one" warning is actually valid? > > Judging by the "Sorted dictionary" thread responses: Yes. > I think you are referring to this comment: ''' Insertion and deletion are still O(n) as all items to the right of the inserted/deleted one have to be shifted by one place. ''' http://groups.google.com/group/comp.lang.python/browse_thread/thread/d3699724d94d5b5a I can certainly see why most reasonable implementations of a list would have insertion/deletion in the middle of the list be O(N), but I don't think that limitation has to apply for the special cases of the beginning and end of the list.
From: Christian Heimes on 22 Jan 2010 15:40 Steve Howell wrote: > Is that really true in CPython? It seems like you could advance the > pointer instead of shifting all the elements. It would create some > nuances with respect to reclaiming the memory, but it seems like an > easy way to make lists perform better under a pretty reasonable use > case. > > Does anybody know off the top of their head if the "have-to-be-shifted- > by-one" warning is actually valid? Why do you think the documentation has such obvious errors? CPython's list type uses an array of pointers to store its members. The type is optimized for the most common list operations in Python: iteration and appending. Python code rarely changes the head or middle of a list. The dequeue type is an optimized data structure for popping and inserting data at both ends. When you list.pop() or list.insert() the pointers in the internal array must be shifted. appending is much faster because the internal array is overallocated meaning it contains free slots at the tail of the data structure. A linked list of pointers requires more memory and iteration is slower since since it can't utilize the CPU's cache as good as an array. Christian
From: Steve Howell on 22 Jan 2010 15:57
On Jan 22, 12:40 pm, Christian Heimes <li...(a)cheimes.de> wrote: > Steve Howell wrote: > > Is that really true in CPython? It seems like you could advance the > > pointer instead of shifting all the elements. It would create some > > nuances with respect to reclaiming the memory, but it seems like an > > easy way to make lists perform better under a pretty reasonable use > > case. > > > Does anybody know off the top of their head if the "have-to-be-shifted- > > by-one" warning is actually valid? > > Why do you think the documentation has such obvious errors? I wasn't making any assumptions, hence the question mark. The Python docs are very good, but even the best of projects make advances that aren't reflected in the docs. > CPython's list type uses an array of pointers to store its members. The > type is optimized for the most common list operations in Python: > iteration and appending. Python code rarely changes the head or middle > of a list. The dequeue type is an optimized data structure for popping > and inserting data at both ends. > I disagree that Python code rarely pops elements off the top of a list. There are perfectly valid use cases for wanting a list over a dequeue without having to pay O(N) for pop(0). Maybe we are just quibbling over the meaning of "rarely." > When you list.pop() or list.insert() the pointers in the internal array > must be shifted. appending is much faster because the internal array is > overallocated meaning it contains free slots at the tail of the data > structure. A linked list of pointers requires more memory and iteration > is slower since since it can't utilize the CPU's cache as good as an array. > I am not proposing a linked list of pointers. I am wondering about something like this: p = &p[1]; (and then reclaim p[0] as free memory, I already said I understood that was the tricky bit) The pointer arithmetic for accessing each element would still work in O (1), right? |