From: Steve Howell on 23 Jan 2010 02:58 On Jan 22, 11:10 pm, a...(a)pythoncraft.com (Aahz) wrote: > In article <83082e19-9130-45a8-91f2-8601c1fda...(a)22g2000yqr.googlegroups.com>, > Steve Howell <showel...(a)yahoo.com> wrote: > > > > > > >I really want to use list *normally* with all its perfectly good > >semantics and reasonable implementation, except for its blind spot > >with respect to popping the first element off the list. The whole > >reason I use CPython vs. C in the first place is that CPython > >programmers can generally program basic data structures better than I > >can. But list.pop(0) is the exception. And, with the possible > >exception of dicts, lists are the most fundamental data structures > >that Python has. > > >I know Python's number one concern will never be speed, but if Python > >makes an O(1) operation into an unnecessarily O(N) operation for no > >good reasons other than "it's too complicated, " or it "adds another > >pointer to the structure," or "it adds another conditional check to > >list_ass_slice for operations that aren't popping off the top," I > >think it's reasonable to challenge the design philosophy. > > "Rough consensus and running code." > > You have a good point, but nobody will ever give your idea serious > attention until there's a patch and benchmarks. Another benchmark is that deques are slower than lists for accessing elements. showell(a)showell-laptop:~$ python foo.py 0.0215361118317 <- list 0.0429010391235 <- deque import time from collections import deque n = 40000 lst = [] for i in range(n): lst.append(i) t = time.time() for i in range(n): lst[i] print time.time() - t lst = deque(lst) t = time.time() for i in range(n): lst[i] print time.time() - t So substituting deque for list suffers not just in convenience, but also in performance.
From: Terry Reedy on 23 Jan 2010 03:13 On 1/23/2010 12:58 AM, Steve Howell wrote: > I really want to use list *normally* with all its perfectly good > semantics and reasonable implementation, except for its blind spot > with respect to popping the first element off the list. It was not designed for that. .pop() was added to lists about 10 years ago because I asked for it (with no parameter, pop off end only) and wrote what would now be a PEP -- and because Tim Peters later supported the idea. Adding the optional parameter was something of an afterthought (never publicly discussed as far as I know) for occasional use for 'short' lists where O(n) is tolerable. You have half persuaded me that that the parameter addition was a mistake. Perhaps is is too attractice a nuisance for some people ;=). > The whole > reason I use CPython vs. C in the first place is that CPython > programmers can generally program basic data structures better than I > can. They have given us three options other that .pop(0). 1. listiterator 2. queue.Queue 3. collections.deque\ Why are you so stubborn about not using a data structure intended for your use case instead of one that was not? There is also 4. a two-list design for queues: iterator through one (or pop() from a reversed version thereof) while appending to another; switch when the first is empty. I plan to write it up with tests some day this year. > I know Python's number one concern will never be speed, but if Python > makes an O(1) operation into an unnecessarily O(N) operation for no > good reasons other than "it's too complicated, " or it "adds another > pointer to the structure," or "it adds another conditional check to > list_ass_slice for operations that aren't popping off the top," I > think it's reasonable to challenge the design philosophy. Challenge yes, mock no. Part of writing good basic data structures is not adding needless complication from featuritis and not penalizing 99.99% of access to satify a .01% need better satisfied another way. Terry Jan Reedy
From: Steve Howell on 23 Jan 2010 03:23 On Jan 23, 12:13 am, Terry Reedy <tjre...(a)udel.edu> wrote: > > Challenge yes, mock no. > > Part of writing good basic data structures is not adding needless > complication from featuritis and not penalizing 99.99% of access to > satify a .01% need better satisfied another way. > I would like to challenge your assertion that advancing ob_item instead of doing memmove during list_ass_slice would impact the performance of list accesses in any way. It would only slow down operations that add/insert items into the list by, and then only by a single conditional statement, and those add/insert operations are already O(N) to begin with.
From: Alf P. Steinbach on 23 Jan 2010 03:32 * Steve Howell: > On Jan 23, 12:13 am, Terry Reedy <tjre...(a)udel.edu> wrote: >> Challenge yes, mock no. >> >> Part of writing good basic data structures is not adding needless >> complication from featuritis and not penalizing 99.99% of access to >> satify a .01% need better satisfied another way. >> > > I would like to challenge your assertion that advancing ob_item > instead of doing memmove during list_ass_slice would impact the > performance of list accesses in any way. It would only slow down > operations that add/insert items into the list by, and then only by a > single conditional statement, and those add/insert operations are > already O(N) to begin with. I'm sorry, no, the last part is incorrect. Appending to a 'list' can currently be constant time, if OS reallocation is constant time (as the string '+' optimization relies on). With the pop optimization it can no longer be constant time without risking an accumulation of unused memory, a memory leak, although it can be amortized constant time, at the cost of wasting some percentage of memory. Cheers & hth., - Alf
From: Steve Howell on 23 Jan 2010 03:54
On Jan 23, 12:32 am, "Alf P. Steinbach" <al...(a)start.no> wrote: > * Steve Howell: > > > On Jan 23, 12:13 am, Terry Reedy <tjre...(a)udel.edu> wrote: > >> Challenge yes, mock no. > > >> Part of writing good basic data structures is not adding needless > >> complication from featuritis and not penalizing 99.99% of access to > >> satify a .01% need better satisfied another way. > > > I would like to challenge your assertion that advancing ob_item > > instead of doing memmove during list_ass_slice would impact the > > performance of list accesses in any way. It would only slow down > > operations that add/insert items into the list by, and then only by a > > single conditional statement, and those add/insert operations are > > already O(N) to begin with. > > I'm sorry, no, the last part is incorrect. > > Appending to a 'list' can currently be constant time, if OS reallocation is > constant time (as the string '+' optimization relies on). > That's true, but it's also irrelevant, as the pop optimization would happen in a branch of code that never gets executed during list appending: if (d < 0) { /* Delete -d items */ /* ADD CODE HERE TO AVOID memmove when ilow == 0 (i.e. ihigh + d == 0) */ memmove(&item[ihigh+d], &item[ihigh], (Py_SIZE(a) - ihigh)*sizeof(PyObject *)); list_resize(a, Py_SIZE(a) + d); item = a->ob_item; } > With the pop optimization it can no longer be constant time without risking an > accumulation of unused memory, a memory leak, although it can be amortized > constant time, at the cost of wasting some percentage of memory. list_resize already overallocates memory to allow room for growth. Whenever you did an append to the list that would force a realloc, you could first check to see if there is unused stuff at the front and do the memmove then and reclaim the unfreed memory. So instead of doing a paying for memmove on every pop, you only pay for it when the list goes to size 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, etc. > |