From: Arnaud Delobelle on 25 Jan 2010 16:32

Steve Howell <showell30(a)yahoo.com> writes:

[...]

> My algorithm does exactly N pops and roughly N list accesses, so I
> would be going from N*N + N to N + N log N if switched to blist.

Can you post your algorithm? It would be interesting to have a concrete
use case to base this discussion on.

--
Arnaud
From: Steve Howell on 25 Jan 2010 17:05

On Jan 25, 1:32 pm, Arnaud Delobelle <arno...(a)googlemail.com> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
>
> [...]
>
> > My algorithm does exactly N pops and roughly N list accesses, so I
> > would be going from N*N + N to N + N log N if switched to blist.
>
> Can you post your algorithm? It would be interesting to have a concrete
> use case to base this discussion on.

It is essentially this, in list_ass_slice:

    if (d < 0) { /* Delete -d items */
        if (ilow == 0) {
            a->popped -= d;
            a->ob_item -= d * sizeof(PyObject *);
            list_resize(a, Py_SIZE(a));
        }
        else {
            memmove(&item[ihigh+d], &item[ihigh],
                (Py_SIZE(a) - ihigh)*sizeof(PyObject *));
            list_resize(a, Py_SIZE(a) + d);
        }
        item = a->ob_item;
    }

I am still working through the memory management issues, but when I
have a complete working patch, I will give more detail.
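[Editor's sketch: the offset idea behind the patch above can be modeled in
pure Python. OffsetList and pop_front are hypothetical names, not part of
the actual C patch; this only illustrates the technique of advancing a head
index instead of shifting elements.]

```python
class OffsetList:
    """Sketch of the patch's idea: deleting from the front just advances
    a head offset (like a->popped) instead of memmove-ing every element;
    storage is compacted occasionally so memory cannot grow unbounded."""

    def __init__(self, items):
        self._items = list(items)
        self._head = 0  # number of slots already popped off the front

    def pop_front(self):
        item = self._items[self._head]
        self._items[self._head] = None  # let the element be collected early
        self._head += 1
        # Compact once the dead prefix dominates the backing list.
        if self._head > 64 and self._head * 2 > len(self._items):
            del self._items[:self._head]
            self._head = 0
        return item

    def __len__(self):
        return len(self._items) - self._head

    def __getitem__(self, i):
        return self._items[self._head + i]

xs = OffsetList(range(5))
assert xs.pop_front() == 0
assert xs.pop_front() == 1
assert len(xs) == 3
assert xs[0] == 2
```

Each pop_front is O(1) except for the occasional compaction, which is
amortized O(1) per pop, mirroring what the C patch aims for.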
From: Steve Howell on 25 Jan 2010 17:09

On Jan 25, 1:32 pm, Arnaud Delobelle <arno...(a)googlemail.com> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
>
> [...]
>
> > My algorithm does exactly N pops and roughly N list accesses, so I
> > would be going from N*N + N to N + N log N if switched to blist.
>
> Can you post your algorithm? It would be interesting to have a concrete
> use case to base this discussion on.

I just realized you meant the Python code itself. It is here:

https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py
From: Steve Howell on 25 Jan 2010 18:00

On Jan 25, 1:00 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
> > These are the reasons I am not using deque:
>
> Thanks for these. Now we are getting somewhere.
>
> > 1) I want to use native lists, so that downstream methods can use
> > them as lists.
>
> It sounds like that could be fixed by making the deque API a proper
> superset of the list API.

That is probably a good idea.

> > 2) Lists are faster for accessing elements.
>
> It sounds like that could be fixed by optimizing deque somewhat. Also,
> have you profiled your application to show that accessing list elements
> is actually using a significant fraction of its runtime and that it
> would be slowed down noticeably by deque? If not, it's a red herring.

I haven't profiled deque vs. list, but I think you are correct about
pop() possibly being a red herring. It appears that the main bottleneck
might still be the processing I do on each line of text, which in my
case is regexes.

For really large lists, I suppose memmove() would eventually become a
bottleneck, but it's brutally fast when it only has to move a couple of
kilobytes of data around.

> > 3) I want to be able to insert elements into the middle of the list.
>
> I just checked, and was surprised to find that deque doesn't support
> this. I'd say go ahead and file a feature request to add it to deque.
> It might be a good thing to add just for consistency's sake.

If somebody first implements an algorithm with lists, then discovers it
has overhead relating to inserting or popping at the front of the list,
then the more deque behaves like a list, the more easily they could
switch their code over to deque. Not knowing much about deque's
internals, I assume its performance for insert() would be O(N), just
like list's, although maybe a tiny bit slower.

> > 4) I have no need for rotating elements.
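[Editor's sketch: the list-vs-deque front-popping tradeoff discussed in
this exchange is easy to measure. This is a rough, self-contained timing
comparison, not code from the thread; drain_list, drain_deque, and the
value of N are arbitrary choices.]

```python
from collections import deque
import timeit

# Drain N elements off the front of a list vs. a deque. In CPython,
# list.pop(0) is O(N) per call because the remaining pointers are
# memmove-d down, while deque.popleft() is O(1).
N = 10_000

def drain_list():
    items = list(range(N))
    while items:
        items.pop(0)

def drain_deque():
    items = deque(range(N))
    while items:
        items.popleft()

print("list.pop(0):   %.4fs" % timeit.timeit(drain_list, number=1))
print("deque.popleft: %.4fs" % timeit.timeit(drain_deque, number=1))
```

At a couple hundred elements the difference is lost in the noise, which
matches the observation above that regex handling dominates for typical
files; the gap only becomes dramatic as N grows.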
> That's unpersuasive, since you're advocating adding a feature to list
> that many others have no need for.

To be precise, I wasn't really advocating for a new feature, but for an
internal optimization of a feature that already exists.

> > Adding a word or two to a list is an O(1) addition to a data structure
> > that takes O(N) memory to begin with.
>
> Yes, as mentioned, additive constants matter.
>
> > Another way of looking at it is that you would need to have 250 or so
> > lists in memory at the same time before the extra pointer was even
> > costing you kilobytes of memory.
>
> I've often run applications with millions of lists, maybe tens of
> millions. Of course it would be 100's of millions if the machines were
> big enough.

I bet even in your application, the amount of memory consumed by the
PyListObjects themselves is greatly dwarfed by other objects, notably
the list elements themselves, not to mention any dictionaries that your
app uses.

> > My consumer laptop has 3027908k of memory.
>
> I thought the idea of buying bigger machines was to solve bigger
> problems, not to solve the same problems more wastefully.

Well, I am not trying to solve problems wastefully here. CPU cycles are
also scarce, so it seems wasteful to do an O(N) memmove that could be
avoided by storing an extra pointer per list. I also think that
encouraging the use of pop(0) would actually make many programs more
memory efficient, in the sense that you can garbage collect list
elements earlier.

Thanks for your patience in responding to me, despite the needlessly
abrasive tone of my earlier postings. I am coming around to this
thinking:

1) Summarize all this discussion and my lessons learned in some kind of
document. It does not have to be a PEP per se, but I could provide a
useful service to the community by listing pros/cons/etc.

2) I would still advocate for removing the warning against list.pop(0)
from the tutorial.
I agree with Steven D'Aprano that the docs really should avoid
describing implementation details in many instances (although I do not
know what he thinks about this particular case). I also think that the
performance penalty for pop(0) is negligible for most medium-sized
programs. For large programs where you really want to swap in deque, I
think most authors are beyond reading the tutorial and are looking
elsewhere for insight on Python data structures.

3) I am gonna try to implement the patch anyway for my own edification.

4) I do think that there are ways that deque could be improved, but it
is not high on my priority list. I will try to mention it in the PEP,
though.
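[Editor's sketch: the claim above that pop(0) lets list elements be
garbage collected earlier is easy to check in CPython, whose reference
counting reclaims an object as soon as its last strong reference is
dropped. Blob is a hypothetical toy class used only for this demo.]

```python
import weakref

class Blob:
    """Toy object so we can watch when it gets collected."""

items = [Blob(), Blob()]
ref = weakref.ref(items[0])

# While the list still holds the element, it stays alive.
assert ref() is not None

# pop(0) removes the list's reference; once we drop the returned
# reference too, CPython reclaims the object immediately, so the
# weak reference goes dead.
popped = items.pop(0)
del popped
assert ref() is None
```

A deque's popleft() or the slicing idiom lst = lst[1:] would free the
element just as promptly; the point is only that popping consumed
elements releases them sooner than keeping an index into a full list.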
From: Steve Howell on 25 Jan 2010 18:30
On Jan 25, 1:32 pm, Arnaud Delobelle <arno...(a)googlemail.com> wrote:
> Steve Howell <showel...(a)yahoo.com> writes:
>
> [...]
>
> > My algorithm does exactly N pops and roughly N list accesses, so I
> > would be going from N*N + N to N + N log N if switched to blist.
>
> Can you post your algorithm? It would be interesting to have a concrete
> use case to base this discussion on.

These are the profile results for an admittedly very large file
(430,000 lines), which show that pop() consumes more time than any
other low-level method. So pop() is not a total red herring. But I
have to be honest and admit that I grossly overestimated the penalty
for smaller files. Typical files are a couple hundred lines, and for
that use case, pop()'s expense gets totally drowned out by regex
handling. In other words, it's a lot cheaper to move a couple hundred
pointers per list-element pop than it is to apply a series of regexes
to them, which shouldn't be surprising.

    ncalls           tottime  percall  cumtime  percall  filename:lineno(function)
    230001/1         149.508    0.001  222.432  222.432  /home/showell/workspace/shpaml_website/shpaml.py:192(recurse)
    429999            17.667    0.000   17.667    0.000  {method 'pop' of 'list' objects}
    530000             8.428    0.000   14.125    0.000  /home/showell/workspace/shpaml_website/shpaml.py:143(get_indented_block)
    3700008            7.877    0.000    7.877    0.000  {built-in method match}
    5410125/5410121    5.697    0.000    5.697    0.000  {len}
    300000             3.938    0.000   22.286    0.000  /home/showell/workspace/shpaml_website/shpaml.py:96(convert_line)
    959999             3.847    0.000    6.759    0.000  /home/showell/workspace/shpaml_website/shpaml.py:29(INDENT)
    959999             3.717    0.000   12.547    0.000  /home/showell/workspace/shpaml_website/shpaml.py:138(find_indentation)
    370000             3.495    0.000   20.204    0.000  /home/showell/workspace/shpaml_website/shpaml.py:109(apply_jquery)
    370000             3.322    0.000    6.528    0.000  {built-in method sub}
    1469999            2.575    0.000    2.575    0.000  {built-in method groups}

As an aside, I am a little surprised by how often I call len() and that it also
takes a large chunk of time, but that's my problem to fix.
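[Editor's sketch: for anyone wanting to reproduce a listing like the one
above, this is a minimal cProfile/pstats recipe. The process() function is
a hypothetical stand-in for shpaml's recurse(): it pops lines off the front
of a list and runs a regex on each one, so both the list pop and the regex
match show up as entries in the report.]

```python
import cProfile
import io
import pstats
import re

PATTERN = re.compile(r"^\s*")

def process(lines):
    """Toy line processor: front-pops plus a regex per line."""
    lines = list(lines)
    while lines:
        line = lines.pop(0)      # shows up as {method 'pop' of 'list' objects}
        PATTERN.match(line)

profiler = cProfile.Profile()
profiler.enable()
process("line %d" % i for i in range(10000))
profiler.disable()

# Sort by total time spent in each function, like the listing above.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("tottime").print_stats(10)
print(buf.getvalue())
```

Sorting by cumtime instead of tottime would surface recurse-style callers
rather than the low-level methods they spend their time in.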