From: Paul Rubin on 25 Jan 2010 23:31

Steve Howell <showell30(a)yahoo.com> writes:
> I haven't profiled deque vs. list, but I think you are correct about
> pop() possibly being a red herring....
> For really large lists, I suppose memmove() would eventually start to
> become a bottleneck, but it's brutally fast when it just moves a
> couple kilobytes of data around.

One way to think of Python is as a scripting wrapper around a bunch of
C functions, rather than as a full-fledged programming language.
Viewed that way, list operations like pop(0) are essentially constant
time unless the list is quite large. By that I mean you can implement
classic structures like doubly-linked lists using Python tuples, but
even though inserting into the middle of them is theoretically O(1),
the memmove's of the native list operations will be much faster in
practice. Programs dealing with large lists (more than a few thousand
elements) are obviously different, and if your program is using such
large lists, you have to plan a little differently when writing the
code.

>> I've often run applications with millions of lists
>
> I bet even in your application, the amount of memory consumed by the
> PyListObjects themselves is greatly dwarfed by other objects, notably
> the list elements themselves

Such lists often would have just one element or even be empty. For
example, you might have a dictionary mapping names to addresses. Most
people have just one address, but some might have no address, and a
few might have more than one address, so you would have a list of
addresses for each name. Of course the dictionary slots in that
example would also use space.

> Well, I am not trying to solve problems wastefully here. CPU cycles
> are also scarce, so it seems wasteful to do an O(N) memmove that could
> be avoided by storing an extra pointer per list.

Realistically the CPython interpreter is so slow that the memmove is
unnoticeable, and Python (at least CPython) just isn't all that
conducive to writing fast code. It makes up for this in programmer
productivity for the many sorts of problems in which moderate speed is
acceptable.

> Thanks for your patience in responding to me, despite the needlessly
> abrasive tone of my earlier postings.

I wondered whether you might have come over from the Lisp newsgroups,
which are pretty brutal. We try to be friendlier here (not that we're
always successful). Anyway, welcome.

> 1) Summarize all this discussion and my lessons learned in some kind
> of document. It does not have to be a PEP per se, but I could provide
> a useful service to the community by listing pros/cons/etc.

I suppose that can't hurt, but there are probably other areas
(multicore parallelism is a perennial one) of much higher community
interest. http://wiki.python.org/moin/ is probably a good place to put
such a document.

> 2) I would still advocate for removing the warning against list.pop
> (0) from the tutorial. I agree with Steven D'Aprano that docs really
> should avoid describing implementation details in many instances

On general principles I agree with Alex Stepanov that the running time
of a function should be part of its interface (nobody wants to use a
stack if popping an element takes quadratic time) and therefore should
be stated in the docs. Python just has a weird incongruence between
the interpreter layer and the C layer, combined with a library
well-evolved for everyday problem sizes, so the traditional asymptotic
approach to algorithm selection often doesn't give the best practical
choice.
I don't feel like looking up what the tutorial says about pop(0), but if it just warns against it without qualification, it should probably be updated.
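To make the point concrete, here is a minimal timing sketch using only
the standard library (the sizes and the exact numbers are illustrative;
results will vary by machine and Python version):

    # Compare popping every element off the front of a list (a memmove
    # each time) with collections.deque.popleft() (O(1) at both ends).
    import timeit

    for n in (1000, 100000):
        list_time = timeit.timeit(
            "while xs: xs.pop(0)",
            setup="xs = list(range(%d))" % n,
            number=1,
        )
        deque_time = timeit.timeit(
            "while xs: xs.popleft()",
            setup="from collections import deque; xs = deque(range(%d))" % n,
            number=1,
        )
        print("n=%d  list.pop(0): %.4fs  deque.popleft(): %.4fs"
              % (n, list_time, deque_time))

For small n the two are hard to tell apart; only at larger sizes does
the O(n) shift behind pop(0) become visible.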
From: Steve Howell on 26 Jan 2010 00:00

On Jan 24, 11:28 am, a...(a)pythoncraft.com (Aahz) wrote:
> In article <b4440231-f33f-49e1-9d6f-5fbce0a63...(a)b2g2000yqi.googlegroups.com>,
> Steve Howell <showel...(a)yahoo.com> wrote:
>
> >Even with realloc()'s brokenness, you could improve pop(0) in a way
> >that does not impact list access at all, and the patch would not change
> >the time complexity of any operation; it would just add negligible
> >extra bookkeeping within list_resize() and a few other places.
>
> Again, your responsibility is to provide a patch and a spectrum of
> benchmarking tests to prove it. Then you would still have to deal with
> the objection that extensions use the list internals -- that might be an
> okay sell given the effort otherwise required to port extensions to
> Python 3, but that's not the way to bet.

Ok, I just submitted a patch to python-dev that illustrates a 100x
speedup on an admittedly artificial program. It still has a long way
to go, but it demonstrates proof of concept. I'm done for the day, but
tomorrow I will try to polish it up and improve it, even if it's
doomed for rejection.

Apologies to all I have offended in this thread. I frankly found some
of the pushback to be a bit hasty and disrespectful, but I certainly
overreacted to some of the criticism. And now I'm in the awkward
position of asking the people I offended to help me with the patch.
If anybody can offer me a hand in understanding some of CPython's
internals, particularly with regard to memory management, it would be
greatly appreciated.

(Sorry I don't have a link to the python-dev posting; it is not
showing up in the archives yet for some reason.)
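The patch itself modifies CPython's list internals in C and its details
are not shown in this thread, but the underlying idea (let pop(0)
advance a stored offset instead of memmove'ing the remaining elements)
can be sketched in pure Python. The class and names below are
hypothetical, purely for illustration:

    # Hypothetical pure-Python sketch of the "keep a head offset" idea:
    # popping from the front just advances an index; the dead prefix is
    # reclaimed only occasionally, so the amortized cost stays O(1).
    class OffsetList:
        def __init__(self, items=()):
            self._items = list(items)
            self._head = 0                    # index of first live element

        def popleft(self):
            if self._head >= len(self._items):
                raise IndexError("pop from empty OffsetList")
            item = self._items[self._head]
            self._items[self._head] = None    # drop the reference early
            self._head += 1
            # Compact once the dead prefix dominates, to bound memory use.
            if self._head > 64 and self._head * 2 > len(self._items):
                del self._items[:self._head]
                self._head = 0
            return item

        def __len__(self):
            return len(self._items) - self._head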
From: Steve Howell on 26 Jan 2010 00:13

On Jan 25, 8:31 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
> > I haven't profiled deque vs. list, but I think you are correct about
> > pop() possibly being a red herring....
> > For really large lists, I suppose memmove() would eventually start to
> > become a bottleneck, but it's brutally fast when it just moves a
> > couple kilobytes of data around.
>
> One way to think of Python is as a scripting wrapper around a bunch of C
> functions, rather than as a full-fledged programming language. Viewed
> that way, list operations like pop(0) are essentially constant time
> unless the list is quite large. By that I mean you can implement
> classic structures like doubly-linked lists using Python tuples, but
> even though inserting into the middle of them is theoretically O(1), the
> memmove's of the native list operations will be much faster in practice.
> Programs dealing with large lists (more than a few thousand elements)
> are obviously different and if your program is using such large lists,
> you have to plan a little differently when writing the code.

Thanks. That is a good way of looking at it.

> Realistically the CPython interpreter is so slow that the memmove is
> unnoticeable, and Python (at least CPython) just isn't all that
> conducive to writing fast code. It makes up for this in programmer
> productivity for the many sorts of problems in which moderate speed
> is acceptable.

Definitely, and moderate speed is enough in a surprisingly large
number of applications.

> > Thanks for your patience in responding to me, despite the needlessly
> > abrasive tone of my earlier postings.
>
> I wondered whether you might have come over from the Lisp newsgroups,
> which are pretty brutal. We try to be friendlier here (not that we're
> always successful). Anyway, welcome.

:)

> > 1) Summarize all this discussion and my lessons learned in some kind
> > of document. It does not have to be a PEP per se, but I could provide
> > a useful service to the community by listing pros/cons/etc.
>
> I suppose that can't hurt, but there are probably other areas (multicore
> parallelism is a perennial one) of much higher community interest.
>
> http://wiki.python.org/moin/ is probably a good place to put such
> a document.

Ok, that's where I'll start.

> > 2) I would still advocate for removing the warning against list.pop
> > (0) from the tutorial. I agree with Steven D'Aprano that docs really
> > should avoid describing implementation details in many instances
>
> On general principles I agree with Alex Stepanov that the running time
> of a function should be part of its interface (nobody wants to use a
> stack if popping an element takes quadratic time) and therefore should
> be stated in the docs. Python just has a weird incongruence between the
> interpreter layer and the C layer, combined with a library well-evolved
> for everyday problem sizes, so the traditional asymptotic approach to
> algorithm selection often doesn't give the best practical choice.
>
> I don't feel like looking up what the tutorial says about pop(0), but if
> it just warns against it without qualification, it should probably
> be updated.

Here it is:

http://docs.python.org/tutorial/datastructures.html#using-lists-as-queues

My opinion is that the warning should be either removed or qualified,
but it is probably fine as written.
'''
It is also possible to use a list as a queue, where the first element
added is the first element retrieved (first-in, first-out); however,
lists are not efficient for this purpose. While appends and pops from
the end of list are fast, doing inserts or pops from the beginning of
a list is slow (because all of the other elements have to be shifted
by one).
'''

The qualifications would be that deque lacks some features that list
has, and that the shift-by-one operation is actually a call to
memmove() and may not apply to all implementations.
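For reference, the usual alternative for queue-like use is
collections.deque, which is designed for fast appends and pops at both
ends; a minimal sketch of the FIFO idiom:

    # FIFO queue with collections.deque: popleft() is O(1), so nothing
    # has to be shifted when the oldest element is removed.
    from collections import deque

    queue = deque()
    queue.append("first")
    queue.append("second")
    queue.append("third")

    while queue:
        print(queue.popleft())   # prints: first, second, third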
From: Steve Holden on 26 Jan 2010 11:57

Steve Howell wrote:
> On Jan 24, 11:28 am, a...(a)pythoncraft.com (Aahz) wrote:
>> In article <b4440231-f33f-49e1-9d6f-5fbce0a63...(a)b2g2000yqi.googlegroups.com>,
>> Steve Howell <showel...(a)yahoo.com> wrote:
>>
>>> Even with realloc()'s brokenness, you could improve pop(0) in a way
>>> that does not impact list access at all, and the patch would not change
>>> the time complexity of any operation; it would just add negligible
>>> extra bookkeeping within list_resize() and a few other places.
>>
>> Again, your responsibility is to provide a patch and a spectrum of
>> benchmarking tests to prove it. Then you would still have to deal with
>> the objection that extensions use the list internals -- that might be an
>> okay sell given the effort otherwise required to port extensions to
>> Python 3, but that's not the way to bet.
>
> Ok, I just submitted a patch to python-dev that illustrates a 100x
> speedup on an admittedly artificial program. It still has a long way
> to go, but it demonstrates proof of concept. I'm done for the day,
> but tomorrow I will try to polish it up and improve it, even if it's
> doomed for rejection. Apologies to all I have offended in this
> thread. I frankly found some of the pushback to be a bit hasty and
> disrespectful, but I certainly overreacted to some of the criticism.
> And now I'm in the awkward position of asking the people I offended to
> help me with the patch. If anybody can offer me a hand in
> understanding some of CPython's internals, particularly with regard to
> memory management, it would be greatly appreciated.
>
> (Sorry I don't have a link to the python-dev posting; it is not
> showing up in the archives yet for some reason.)

Fortunately for you this is a very welcoming group, and particularly
responsive to individuals who have seen the error of their ways ;-)

regards
 Steve
--
Steve Holden        +1 571 484 6266  +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010   http://us.pycon.org/
Holden Web LLC                       http://www.holdenweb.com/
UPCOMING EVENTS:                     http://holdenweb.eventbrite.com/
From: Arnaud Delobelle on 27 Jan 2010 02:34
Steve Howell <showell30(a)yahoo.com> writes:
> On Jan 25, 1:32 pm, Arnaud Delobelle <arno...(a)googlemail.com> wrote:
>> Steve Howell <showel...(a)yahoo.com> writes:
>>
>> [...]
>>
>> > My algorithm does exactly N pops and roughly N list accesses, so I
>> > would be going from N*N + N to N + N log N if switched to blist.
>>
>> Can you post your algorithm? It would be interesting to have a concrete
>> use case to base this discussion on.
>
> I just realized you meant the Python code itself. It is here:
>
> https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py

Hi - I didn't have the time to look at it before today.

You scan through a list called prefix_lines, and you use pop(0) as a
means to keep track of your position in this list. It seems to me that
it would be more effective to explicitly keep track of it - and it
would remove the need for the numerous copies of sublists of
indent_lines that you have to create.

I have rewritten part of the three relevant functions (html_block_tag,
get_indented_block and recurse, within indent_lines) to show you what
I mean (see below). I have tried to keep the changes to a minimum -
you can see that the code is no more complex this way. The advantage
is that only one list is created, prefix_lines, and there is no need
to mutate it or copy parts of it during the algorithm.

I have not had the time to test it though (only on one of the examples
on the examples webpage).

Code follows:

[...]

def html_block_tag(output, block, start, end, recurse):
    append = output.append
    prefix, tag = block[start]
    if RAW_HTML.regex.match(tag):
        append(prefix + tag)
        recurse(block, start + 1, end)
    else:
        start_tag, end_tag = apply_jquery_sugar(tag)
        append(prefix + start_tag)
        recurse(block, start + 1, end)
        append(prefix + end_tag)

[...]

def get_indented_block(prefix_lines, start, end):
    prefix, line = prefix_lines[start]
    len_prefix = len(prefix)
    i = start + 1
    while i < end:
        new_prefix, line = prefix_lines[i]
        if line and len(new_prefix) <= len_prefix:
            break
        i += 1
    while i - 1 > start and prefix_lines[i - 1][1] == '':
        i -= 1
    return i

[...]

def indent_lines(lines, output,
                 branch_method,
                 leaf_method,
                 pass_syntax,
                 flush_left_syntax,
                 flush_left_empty_line,
                 indentation_method,
                 get_block,
                 ):
    append = output.append

    def recurse(prefix_lines, start, end):
        while start < end:
            prefix, line = prefix_lines[start]
            if line == '':
                start += 1
                append('')
            else:
                block_end = get_block(prefix_lines, start, end)
                if block_end == start + 1:
                    start += 1
                    if line == pass_syntax:
                        pass
                    elif line.startswith(flush_left_syntax):
                        append(line[len(flush_left_syntax):])
                    elif line.startswith(flush_left_empty_line):
                        append('')
                    else:
                        append(prefix + leaf_method(line))
                else:
                    branch_method(output, prefix_lines, start, block_end,
                                  recurse)
                    start = block_end
        return

    prefix_lines = map(indentation_method, lines)
    recurse(prefix_lines, 0, len(prefix_lines))
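The general idiom Arnaud is applying here, replacing destructive
pop(0) scanning with an explicit cursor into an unmodified list, can be
sketched independently of shpaml itself. The functions and the
handle() callback below are hypothetical, purely for illustration:

    # Draining a list with pop(0): every pop shifts all remaining
    # elements left by one, so processing n lines costs O(n**2) moves.
    def consume_with_pop(lines, handle):
        lines = list(lines)          # work on a throwaway copy
        while lines:
            handle(lines.pop(0))

    # The same traversal with an explicit cursor: the list is never
    # mutated, and "the rest of the input" is just an index rather than
    # a sliced copy.
    def consume_with_cursor(lines, handle, start=0, end=None):
        if end is None:
            end = len(lines)
        i = start
        while i < end:
            handle(lines[i])
            i += 1

    # Example usage:
    # consume_with_cursor(["a", "b", "c"], print)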