From: Jussi Piitulainen on 5 Mar 2010 09:43

Stefan Behnel writes:
> Jussi Piitulainen, 04.03.2010 22:40:
> > Stefan Behnel writes:
> >> Jussi Piitulainen, 04.03.2010 11:46:
> >>> I am observing weird semi-erratic behaviour that involves Python 3
> >>> and lxml, is extremely sensitive to changes in the input data, and
> >>> only occurs when I name a partial result. I would like some help
> >>> with this, please. (Python 3.1.1; GNU/Linux; how do I find lxml
> >>> version?)
> >>
> >> Here's how to find the version:
> >>
> >> http://codespeak.net/lxml/FAQ.html#i-think-i-have-found-a-bug-in-lxml-what-should-i-do
> >
> > Ok, thank you. Here's the results:
> >
> > >>> print(et.LXML_VERSION, et.LIBXML_VERSION,
> > ...       et.LIBXML_COMPILED_VERSION, et.LIBXSLT_VERSION,
> > ...       et.LIBXSLT_COMPILED_VERSION)
> > (2, 2, 4, 0) (2, 6, 26) (2, 6, 26) (1, 1, 17) (1, 1, 17)
>
> I can't reproduce this with the latest lxml trunk (and Py3.2 trunk)
> and libxml2 2.7.6, even after running your test script for quite a
> while. I'd try to upgrade the libxml2 version.

Thank you much. I suppose that is good news. It's a big server with
many users - I will ask the administrators to consider an upgrade
when I get around to it.

Turns out that lxml documentation warns not to use libxml2 version
2.6.27 if I want to use xpath, and that is just a bit newer than we
have. On that cue, I seem to have found a workaround: I replaced the
xpath expression with findall(titlef) where

    titlef = ( '//{http://www.openarchives.org/OAI/2.0/}record'
               '//{http://purl.org/dc/elements/1.1}title' )

In the previously broken naming() function I now have:

    result = etree.parse(BytesIO(body))
    n = len(result.findall(titlef))

And in the previously working nesting() function:

    n = len(etree.parse(BytesIO(body)).findall(titlef))

With these changes, the test script gives consistently the result
that I expect, and the more complicated real test script where I
first met the problem also appears to work without a hitch.

So, this works.
The other, broken behaviour is totally scary, though.
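
For anyone who wants to try the findall() workaround in isolation, here
is a minimal self-contained sketch. It is not the original test script:
the sample body is made up, the path is written with a leading './/'
instead of '//', and the Dublin Core namespace is spelled with a
trailing slash because that is how the made-up sample declares it (the
namespace in the path has to match the data exactly). It also falls
back to the standard library's ElementTree when lxml is not installed,
on the assumption that findall() behaves the same for a path like this.

```python
from io import BytesIO

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: the stdlib module accepts the same findall() path.
    import xml.etree.ElementTree as etree

OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"  # assumption: matches the sample below

# ElementPath expression: every dc:title anywhere below an OAI record.
titlef = ".//{%s}record//{%s}title" % (OAI, DC)

# Hypothetical sample data, standing in for the real response body.
body = (
    b'<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"'
    b' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<ListRecords>"
    b"<record><metadata><dc:title>First</dc:title></metadata></record>"
    b"<record><metadata><dc:title>Second</dc:title></metadata></record>"
    b"</ListRecords>"
    b"</OAI-PMH>"
)


def naming(body):
    # Variant that names the parsed tree before searching it.
    result = etree.parse(BytesIO(body))
    return len(result.findall(titlef))


def nesting(body):
    # Variant that searches the parsed tree without naming it.
    return len(etree.parse(BytesIO(body)).findall(titlef))


if __name__ == "__main__":
    print(naming(body), nesting(body))
```

Both functions should report the same count for the same body; running
them in a loop over real data is one way to check that the erratic
behaviour has gone away.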