From: kj on 1 Feb 2010 21:34

I just spent about 1-1/2 hours tracking down a bug.

An innocuous little script, let's call it buggy.py, only 10 lines long, and whose output should have been, at most, two lines, was quickly dumping tens of megabytes of non-printable characters to my screen (aka gobbledygook), and in the process was messing up my terminal *royally*. Here's buggy.py:

    import sys
    import psycopg2
    connection_params = "dbname='%s' user='%s' password='%s'" % tuple(sys.argv[1:])
    conn = psycopg2.connect(connection_params)
    cur = conn.cursor()
    cur.execute('SELECT * FROM version;')
    print '\n'.join(x[-1] for x in cur.fetchall())

(Of course, buggy.py is pretty useless; I reduced the original, more useful, script to this to help me debug it.)

Through a *lot* of trial and error I finally discovered that the root cause of the problem was the fact that, in the same directory as buggy.py, there is *another* innocuous little script, totally unrelated, whose name happens to be numbers.py. (This second script is one I wrote as part of a little Python tutorial I put together months ago, and is not much more of a script than hello_world.py; it's baby-steps for the absolute beginner. But apparently, it has a killer name! I had completely forgotten about it.)

Both scripts live in a directory filled with *hundreds* of little one-off scripts like the two of them. I'll call this directory myscripts in what follows.

It turns out that buggy.py imports psycopg2, as you can see, and apparently psycopg2 (or something imported by psycopg2) tries to import some standard Python module called numbers; instead it ends up importing the innocent myscripts/numbers.py, resulting in *absolute mayhem*.

(This is no mere Python "wart"; this is a suppurating chancre, and the fact that it remains unfixed is a neverending source of puzzlement for me.)

How can the average Python programmer guard against this sort of time-devouring bug in the future (while remaining a Python programmer)? The only solution I can think of is to avoid like the plague the basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files, and *pray* that whatever name one chooses for one's script does not suddenly pop up in the appropriate /usr/lib/pythonX.XX directory of a future release.

What else can one do? Let's see, one should put every script in its own directory, thereby containing the damage.

Anything else?

Any suggestion would be appreciated.

TIA!

~k
From: Chris Rebert on 1 Feb 2010 21:57

On Mon, Feb 1, 2010 at 6:34 PM, kj <no.email(a)please.post> wrote:
> I just spent about 1-1/2 hours tracking down a bug.
<snip>
> Through a *lot* of trial an error I finally discovered that the
> root cause of the problem was the fact that, in the same directory
> as buggy.py, there is *another* innocuous little script, totally
> unrelated, whose name happens to be numbers.py. (This second script
> is one I wrote as part of a little Python tutorial I put together
> months ago, and is not much more of a script than hello_world.py;
> it's baby-steps for the absolute beginner. But apparently, it has
> a killer name! I had completely forgotten about it.)
>
> Both scripts live in a directory filled with *hundreds* little
> one-off scripts like the two of them. I'll call this directory
> myscripts in what follows.
>
> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to
> import some standard Python module called numbers; instead it ends
> up importing the innocent myscript/numbers.py, resulting in *absolute
> mayhem*.
>
> (This is no mere Python "wart"; this is a suppurating chancre, and
> the fact that it remains unfixed is a neverending source of puzzlement
> for me.)
>
> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?
> The only solution I can think of is to avoid like the plague the
> basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files,
> and *pray* that whatever name one chooses for one's script does
> not suddenly pop up in the appropriate /usr/lib/pythonX.XX directory
> of a future release.
>
> What else can one do? Let's see, one should put every script in its
> own directory, thereby containing the damage.
>
> Anything else?
>
> Any suggestion would be appreciated.
I think absolute imports avoid this problem:

    from __future__ import absolute_import

For details, see PEP 328: http://www.python.org/dev/peps/pep-0328/

Cheers,
Chris
--
http://blog.rebertia.com
From: Roy Smith on 1 Feb 2010 22:15

In article <hk82uv$8kn$1(a)reader1.panix.com>, kj <no.email(a)please.post> wrote:
> Through a *lot* of trial an error I finally discovered that the
> root cause of the problem was the fact that, in the same directory
> as buggy.py, there is *another* innocuous little script, totally
> unrelated, whose name happens to be numbers.py.
> [...]
> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to
> import some standard Python module called numbers; instead it ends
> up importing the innocent myscript/numbers.py, resulting in *absolute
> mayhem*.

I feel your pain, but this is not a Python problem, per se. The general pattern is:

1) You have something which refers to a resource by name.
2) There is a sequence of places which are searched for this name.
3) The search finds the wrong one because another resource by the same name appears earlier in the search path.

I've gotten bitten like this by shells finding the wrong executable (in $PATH). By dynamic loaders finding the wrong library (in $LD_LIBRARY_PATH). By C compilers finding the wrong #include file. And so on. This is just Python's import finding the wrong module in sys.path, Python's module search path.

The solution is the same in all cases. You either have to refer to resources by some absolute name, or you need to make sure you set up your search paths correctly and know what's in them.

In your case, one possible solution would be to make sure "." (or "") isn't in sys.path (although that might cause other issues).
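The last suggestion could be sketched roughly as follows. This is only an illustration under assumptions: drop_script_dir is a hypothetical helper name, and filtering out the script's own directory (which Python normally places first in sys.path) goes a little beyond the literal "." and "" entries mentioned above.

```python
import os


def drop_script_dir(path_entries, script_dir):
    """Return a copy of path_entries without '', '.', or script_dir.

    Hypothetical helper: it only filters a list; nothing is modified in place.
    """
    script_dir = os.path.abspath(script_dir)
    cleaned = []
    for entry in path_entries:
        # '' and '.' both mean "the current directory" on the search path.
        if entry in ("", "."):
            continue
        # Also skip the directory the running script lives in.
        if os.path.abspath(entry) == script_dir:
            continue
        cleaned.append(entry)
    return cleaned
```

At the very top of a script, before any other import, one might then write sys.path[:] = drop_script_dir(sys.path, os.path.dirname(os.path.abspath(__file__))). As the caveat above says, this can cause other issues: the script can then no longer import helper modules that sit next to it.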
From: Steven D'Aprano on 1 Feb 2010 22:28

On Tue, 02 Feb 2010 02:34:07 +0000, kj wrote:
> I just spent about 1-1/2 hours tracking down a bug.
>
> An innocuous little script, let's call it buggy.py, only 10 lines long,
> and whose output should have been, at most two lines, was quickly
> dumping tens of megabytes of non-printable characters to my screen (aka
> gobbledygook), and in the process was messing up my terminal *royally*.
> Here's buggy.py:
[...]
> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to import
> some standard Python module called numbers; instead it ends up importing
> the innocent myscript/numbers.py, resulting in *absolute mayhem*.

There is no module numbers in the standard library, at least not in 2.5.

    >>> import numbers
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named numbers

It must be specific to psycopg2.

I would think this is a problem with psycopg2 -- it sounds like it should be written as a package, but instead is written as a bunch of loose modules. I could be wrong of course, but if it is just a collection of modules, I'd definitely call that a poor design decision, if not a bug.

> (This is no mere Python "wart"; this is a suppurating chancre, and the
> fact that it remains unfixed is a neverending source of puzzlement for
> me.)

No, it's a wart. There's no doubt it bites people occasionally, but I've been programming in Python for about ten years and I've never been bitten by this yet. I'm sure it will happen some day, but not yet.

In this case, the severity of the bug (megabytes of binary crud to the screen) is not related to the cause of the bug (shadowing a module).

As for fixing it, unfortunately it's not quite so simple to fix without breaking backwards-compatibility. The opportunity to do so for Python 3.0 was missed. Oh well, life goes on.

> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?
> The only solution I can think of is to avoid like the plague the
> basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files, and
> *pray* that whatever name one chooses for one's script does not suddenly
> pop up in the appropriate /usr/lib/pythonX.XX directory of a future
> release.

Unfortunately, Python makes no guarantee that there won't be some clash between modules. You can minimize the risks by using packages, e.g. given a package spam containing modules a, b, c, and d, if you refer to spam.a etc. then you can't clash with modules a, b, c, d, but only spam. So you've cut your risk profile from five potential clashes to only one.

Also, generally most module clashes are far more obvious. If you do this:

    import module
    x = module.y

and module is shadowed by something else, you're *much* more likely to get an AttributeError than megabytes of crud to the screen.

I'm sorry that you got bitten so hard by this, but in practice it's uncommon, and relatively mild when it happens.

> What else can one do? Let's see, one should put every script in its own
> directory, thereby containing the damage.

That's probably a bit extreme, but your situation:

"Both scripts live in a directory filled with *hundreds* little one-off scripts like the two of them."

is far too chaotic for my liking. You don't need to go to the extreme of a separate directory for each file, but you can certainly tidy things up a bit. For example, anything that's obsolete should be moved out of the way where it can't be accidentally executed or imported.

--
Steven
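As an aid to that kind of tidying, a directory of one-off scripts could be scanned for names that collide with standard-library modules. The following is a minimal sketch, not an established tool: find_shadowing_scripts is a hypothetical helper, and the set of names to check against has to be supplied by the caller.

```python
import os


def find_shadowing_scripts(directory, stdlib_names):
    """List files in directory whose basename matches a name in stdlib_names.

    Hypothetical helper: only .py and .pyc files are considered, and the
    comparison is purely by name; nothing is imported or executed.
    """
    clashes = []
    for filename in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(filename)
        if ext in (".py", ".pyc") and base in stdlib_names:
            clashes.append(filename)
    return clashes
```

On Python 3.10 and later, sys.stdlib_module_names could serve as stdlib_names. On older versions the set would have to be built by hand, for instance from the basenames under /usr/lib/pythonX.XX that kj mentions.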
From: Tim Chase on 1 Feb 2010 22:33
Stephen Hansen wrote:
> First, I don't shadow built in modules. Its really not very hard to avoid.

Given the comprehensive nature of the batteries included in Python, it's not that hard to accidentally shadow a built-in module that is unknown to you but is imported by a module you are using. The classic that's stung me enough times (and many others on c.l.p and other forums, as a quick google evidences) such that I *finally* remember:

    bash$ touch email.py
    bash$ python
    ...
    >>> import smtplib
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/smtplib.py", line 46, in <module>
        import email.Utils
    ImportError: No module named Utils

"email.py" is an innocuous name for a script/module in which you might want to do emailish things, and it's likely you'll use smtplib in the same code...and kablooie, things blow up even if your code doesn't reference or directly use the built-in email.py.

Yes, as Chris mentions, PEP 328 absolute vs. relative imports should help ameliorate the problem, but it's not yet commonly used (unless you're using Py3, it's only at the request of a __future__ import in 2.5+).

-tkc
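When a failure like the one above looks suspicious, one quick check is to ask where a module was actually loaded from. A minimal sketch, assuming a helper named loaded_from (hypothetical, not part of any library) and using only importlib.import_module and the module's __file__ attribute:

```python
import importlib
import os


def loaded_from(module_name, directory):
    """Return True if module_name was imported from a file directly in directory.

    Hypothetical helper: it imports the module (so any import side effects
    happen) and compares the directory of its __file__ with directory.
    """
    module = importlib.import_module(module_name)
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in modules such as sys have no __file__ to compare.
        return False
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return module_dir == os.path.abspath(directory)
```

In the email.py scenario, loaded_from("email", os.getcwd()) should come back True, which points at the stray local file rather than the standard library package.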