From: Steve Howell on 24 Jan 2010 14:26 On Jan 23, 3:04 pm, Terry Reedy <tjre...(a)udel.edu> wrote: > On 1/23/2010 12:17 PM, Steve Howell wrote: > > > Terry Reedy said: > > > ''' > > If you try writing a full patch, as I believe someone did, or at least > > a > > prototype thereof, when the idea was discussed, you might have a > > better > > idea of what the tradeoffs are and why it was rejected. > > ''' > > > I have to run, but tomorrow I will try to dig through python-dev > > archives and find the patch. If anybody has hints on where to look > > for it (anybody remember the author, for example?), it would be much > > appreciated. > > The approach you outlined in your other response to me is, I believe, > what was considered, investigated, and then rejected (by Guido, with > agreement). The discussion may have been on the now-closed and > (misspelled) pyk3 (?), or maybe on python-ideas, but my memory is more > likely the former. I am sure that Raymond H. was involved also. > > > If the patch looks simple, I will try to pitch the idea that its time > > has come. Now that the specification of the language itself is > > frozen, I think there might be more room for improving > > implementations. Also, I might be able to make the argument that > > tradeoffs of memory vs. CPU vs. code complexity have different forces > > in the 2010s. > > I am not opposed to a possible change, just hasty, ill-informed > criticism. If there is not a PEP on this issue, it would be good to have > one that recorded the proposal and the pros and cons, regardless of the > outcome, so there would be something to refer people to. If that had > been already done, it would have shortened this thread considerably. > I think it's a good idea to write a PEP on this issue, and I will attempt a first draft. I think I should submit the first draft to python-ideas, correct? I expect the PEP to be at least initially, if not permanently, rejected, but it would not be an exercise in futility, as I agree it's good to record pros and cons of the proposal in one place. The PEP probably would not include a proposed patch until there was a little bit of consensus behind it, but it would not take me a lot of time to present the basic argument. Here is my sketch of what the PEP would look like. Proposal: Improve list's implementation so that deleting elements from the front of the list does not require an O(N) memmove operation. Rationale: Some Python programs that process lists have multiple methods that consume the first element of the list and pop it off. The pattern comes up with parsers in particular, but there are other examples. It is possible now, of course, to use a data structure in Python that has O(1) for deleting off the top of the list, but none of the alternatives fully replicate the benefits of list itself. Specification: Improving CPython's performance does not affect the language itself, so there are no bikeshed arguments to be had with respect to syntax, etc. Any patch would, of course, affect the performance of nearly every Python program in existence, so any patch would have to, at a bare minimum: 1) Not increase the time or memory complexity of any other list operation. 2) Not affect list access at all. 3) Minimally affect list operations that mutate the list. 4) Be reasonably simple within CPython itself. 5) Not be grossly wasteful of memory. Backwards Compatibility: See above. An implementation of this PEP would not change the definition of the language in any way, but it would have to minimally impact the performance of lists for the normal use cases. Implementation: There are two ways to make deleting the first item of the list run more efficiently. The most ambitious proposal is to fix the memory manager itself to allow the release of memory from the start of the chunk. The advantage of this proposal is that it would simplify the changes to list itself, and possibly have collateral benefits for other CPython internal data structures. The disadvantage of the proposal is that there is a strong tradition in CPython to use native memory management, particularly with respect to the fact that it runs on many platforms. The less ambitious proposal is to change the memory management scheme within list itself. There is already precedent in list_resize() to optimistically allocate memory, so it is not a great break from tradition to optimistically defer the release of memory. But it would complicate things. References: I would refer to this thread on comp.lang.python for discussion, and I would also try to dig up older threads on python-dev or elsewhere.
From: Aahz on 24 Jan 2010 14:28 In article <b4440231-f33f-49e1-9d6f-5fbce0a63bd2(a)b2g2000yqi.googlegroups.com>, Steve Howell <showell30(a)yahoo.com> wrote: > >Even with realloc()'s brokenness, you could improve pop(0) in a way >that does not impact list access at all, and the patch would not change >the time complexity of any operation; it would just add negligible >extract bookkeeping within list_resize() and a few other places. Again, your responsibility is to provide a patch and a spectrum of benchmarking tests to prove it. Then you would still have to deal with the objection that extensions use the list internals -- that might be an okay sell given the effort otherwise required to port extensions to Python 3, but that's not the way to bet. Have you actually read the discussions you were pointed at? -- Aahz (aahz(a)pythoncraft.com) <*> http://www.pythoncraft.com/ import antigravity
From: Steve Howell on 24 Jan 2010 14:53 On Jan 24, 11:28 am, a...(a)pythoncraft.com (Aahz) wrote: > In article <b4440231-f33f-49e1-9d6f-5fbce0a63...(a)b2g2000yqi.googlegroups.com>, > Steve Howell <showel...(a)yahoo.com> wrote: > > > > >Even with realloc()'s brokenness, you could improve pop(0) in a way > >that does not impact list access at all, and the patch would not change > >the time complexity of any operation; it would just add negligible > >extract bookkeeping within list_resize() and a few other places. > > Again, your responsibility is to provide a patch and a spectrum of > benchmarking tests to prove it. Then you would still have to deal with > the objection that extensions use the list internals -- that might be an > okay sell given the effort otherwise required to port extensions to > Python 3, but that's not the way to bet. Ok. > Have you actually read the discussions you were pointed at? I don't think anybody provided an actual link, but please correct me if I overlooked it.
From: Terry Reedy on 24 Jan 2010 15:13 On 1/24/2010 2:26 PM, Steve Howell wrote: > I think it's a good idea to write a PEP on this issue, and I will > attempt a first draft. I think I should submit the first draft to > python-ideas, correct? That is not a *requirement* for drafts in general, but it is a good idea for a community or community-person generated proposal, such as this one. > I expect the PEP to be at least initially, if not permanently, > rejected, Guido sometimes rejects 'no-chance' proposals without waiting to be asked, but he usually waits until the PEP author feels the issue is ripe and asks for a pronouncement. tjr
From: Paul Rubin on 24 Jan 2010 15:44
Steve Howell <showell30(a)yahoo.com> writes: > Proposal: Improve list's implementation so that deleting elements from > the front of the list does not require an O(N) memmove operation. ... > It is possible now, of course, to use a data structure in > Python that has O(1) for deleting off the top of the list, but none of > the alternatives fully replicate the benefits of list itself. I think you are mostly referring to deque. Why don't you instead say what you think is wrong with using deque, and how deque can be improved? > See above. An implementation of this PEP would not change the > definition of the language in any way, but it would have to minimally > impact the performance of lists for the normal use cases. But you're talking about adding one or two words to EVERY list, and many normal use cases allocate a LOT of lists. Those use cases are likely more common than use cases that pop from the front of the list but for some reason can't use deque. > The most ambitious proposal is to fix the memory manager itself to > allow the release of memory from the start of the chunk. That's inappropriate given the memory fragmentation it would cause. Really, you're describing a problem that arises in a few programs but up til now, as far as I know, everyone has found deque to be an adequate solution. Your approach of snarling against list is not persuading anyone that list needs to be changed, because most everyone is satisfied with the existing solution. You might change approaches and discuss deque, what's wrong with it, and whether it can be fixed. Getting a change approved for deque is probably much easier than getting one approved for list, just because nowhere near as many things depend on deque's performance. And when discussing performance in this context, additive constants do matter. |