From: Steven D'Aprano on 31 Oct 2009 21:27

On Sat, 31 Oct 2009 16:27:20 +0000, kj wrote:

>>1) it's a bad idea to name your own modules after modules in the stdlib
>
> Obviously, since it leads to the headaches this thread illustrates. But
> there is nothing intrinsically wrong with it. The fact that it is
> problematic in Python is a design bug, plain and simple. There's no
> rational basis for it,

Incorrect. Simplicity of implementation and API is a virtue, in and of itself. The existing module machinery is quite simple to understand, use and maintain. Dealing with name clashes doesn't come for free. If you think it does, I encourage you to write a patch implementing the behaviour you would prefer.

In addition, there are use-cases where the current behaviour is the correct behaviour. Here's one way to backport (say) functools to older versions of Python (untested):

    # === functools.py ===
    import sys

    if sys.version_info >= (2, 5):
        # Use the standard library version if it is available.
        old_path = sys.path[:]
        del sys.path[0]  # Delete the current directory.
        from functools import *
        sys.path[:] = old_path  # Restore the path.
    else:
        # Backport code you want.
        pass

> and represents an unreasonable demand on module
> writers, since contrary to the tight control on reserved Python
> keywords, there does not seem to be a similar control on the names of
> stdlib modules. What if, for example, in the future it was decided that
> my_favorite_module name would become part of the standard library? This
> alone would cause code to break.

Not necessarily. Obviously your module my_favorite_module.py isn't calling the standard library version, because it didn't exist when you wrote it. Nor are any of your callers. Mere name clashes alone aren't necessarily an issue. One problem comes about when some module you import is modified to start using the standard library module, which conflicts with yours.

Example: You have a collections module, which imports the standard library stat module. The Python standard library can safely grow a collections module, but what it can't do is grow a collections module *and* modify stat to use that.

But in general, yes, you are correct -- there is a risk that future modules added to the standard library can clash with existing third party modules. This is one of the reasons why Python is conservative about adding to the std lib.

In other words, yes, module naming conflicts are the Python version of DLL Hell. Python doesn't distinguish between "my modules" and "standard modules" and "third party modules" -- they're all just modules, there aren't three different implementations for importing a module and you don't have to learn three different commands to import them.

But there is a downside too: if you write "import os" Python has no possible way of knowing whether you mean the standard os.py module or your own os.py module.

Of course, Python does expose the import machinery to you. If avoiding standard library names is too much of a trial for you, or if you are paranoid and want to future-proof your module against changes to the standard library (a waste of time in my opinion), you can use Python's import machinery to build your own system.

--
Steven
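
As an illustration of "use Python's import machinery to build your own system", here is a minimal sketch that loads a module from an explicit file instead of searching the path. It assumes a modern Python 3 (importlib.util.module_from_spec needs 3.5 or later), which postdates this thread; the helper name load_module_from_file and the module name my_private_os are made up for the example.

```python
import importlib.util
import os
import tempfile


def load_module_from_file(name, file_path):
    """Load the module stored at file_path under the given name.

    sys.path is never searched, so a same-named standard library
    module cannot be picked up by accident (and vice versa).
    """
    spec = importlib.util.spec_from_file_location(name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # run the module body
    return module


# Demo: a throwaway file that happens to be called os.py.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "os.py")
with open(demo_file, "w") as fh:
    fh.write("WHO = 'my own os.py'\n")

# "my_private_os" is a hypothetical name chosen for this sketch.
mine = load_module_from_file("my_private_os", demo_file)
who = mine.WHO

# The name "os" in this script is still the standard library module.
stdlib_os_intact = hasattr(os, "getcwd")
```

The loaded module is deliberately not registered in sys.modules here, so it stays private to whoever holds the reference.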
From: Gabriel Genellina on 1 Nov 2009 00:38

On Sat, 31 Oct 2009 12:12:21 -0300, kj <no.email(a)please.post> wrote:

> I'm running into an ugly bug, which, IMHO, is really a bug in the
> design of Python's module import scheme.

The basic problem is that the "import scheme" was not designed in advance. It was a very simple thing at first. Then came packages. And then the __import__ builtin. And later some import hooks. And later support for zip files. And more import hooks and meta hooks. And namespace packages. And relative imports, absolute imports, and mixed imports. And now it's a mess.

> Consider the following
> directory structure:
> [containing a re.py file in the same directory as the main script]
>
> If I now run the innocent-looking ham/spam.py, I get the following
> error:
>
> % python26 ham/spam.py
> Traceback (most recent call last):
> [...]
>   File "/usr/local/python-2.6.1/lib/python2.6/string.py", line 116, in __init__
>     'delim' : _re.escape(cls.delimiter),
> AttributeError: 'module' object has no attribute 'escape'
>
> My sin appears to be having the (empty) file ham/re.py. So Python
> is confusing it with the re module of the standard library, and
> using it when the inspect module tries to import re.

Exactly; that's the root of your problem, and has been a problem ever since import existed.

On Sat, 31 Oct 2009 13:27:20 -0300, kj <no.email(a)please.post> wrote:

>> 2) this has been fixed in Py3
>
> In my post I illustrated that the failure occurs both with Python
> 2.6 *and* Python 3.0. Did you have a particular version of Python
> 3 in mind?

If the `re` module had been previously loaded (the true one, from the standard library) then this bug is not apparent. This may happen if re is imported from site.py, sitecustomize.py, any .pth file, the PYTHONSTARTUP script, perhaps other sources...

The same error happens if ham\spam.py contains the single line: import smtpd, and instead of re.py there is an empty asyncore.py file; that fails on 3.1 too.

On Sat, 31 Oct 2009 22:27:09 -0300, Steven D'Aprano <steve(a)remove-this-cybersource.com.au> wrote:

> On Sat, 31 Oct 2009 16:27:20 +0000, kj wrote:
>
>>> 1) it's a bad idea to name your own modules after modules in the stdlib
>>
>> Obviously, since it leads to the headaches this thread illustrates. But
>> there is nothing intrinsically wrong with it. The fact that it is
>> problematic in Python is a design bug, plain and simple. There's no
>> rational basis for it,
>
> Incorrect. Simplicity of implementation and API is a virtue, in and of
> itself. The existing module machinery is quite simple to understand, use
> and maintain.

Uhm... module objects might be quite simple to understand, but module handling is everything but simple! (simplicity of implem...? quite simple to WHAT? ROTFLOL!!! :) )

> Dealing with name clashes doesn't come for free. If you
> think it does, I encourage you to write a patch implementing the
> behaviour you would prefer.

I'd say it is really a bug, and has existed for a long time. One way to avoid name clashes would be to put the entire standard library under a package; a program that wants the standard re module would write "import std.re" instead of "import re", or something similar. Every time the std package is suggested, the main argument against it is backwards compatibility.

> In addition, there are use-cases where the current behaviour is the
> correct behaviour. Here's one way to backport (say) functools to older
> versions of Python (untested):

You still would be able to backport or patch modules, even if the standard ones live in the "std" package.

On Sat, 31 Oct 2009 12:12:21 -0300, kj <no.email(a)please.post> wrote:

> I've tried a lot of things to appease Python on this one, including
> a liberal sprinkling of "from __future__ import absolute_import"
> all over the place (except, of course, in inspect.py, which I don't
> control), but to no avail.

I think the only way is to make sure *your* modules always come *after* the standard ones in sys.path; try using this code right at the top of your main script:

    import sys, os.path

    # Directory of the main script ('' when there is no script).
    if sys.argv[0]:
        script_path = os.path.dirname(os.path.abspath(sys.argv[0]))
    else:
        script_path = ''
    # Move that directory from the front of sys.path to the end.
    if script_path in sys.path:
        sys.path.remove(script_path)
    sys.path.append(script_path)

(I'd want to put such code in sitecustomize.py, but sys.argv doesn't exist yet at the time sitecustomize.py is executed)

--
Gabriel Genellina
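
To see why moving the script directory to the end of sys.path changes which file wins, here is a small sketch that asks the path-based finder which file it would pick for a name, given a search order. It assumes Python 3.4 or later (importlib.machinery.PathFinder.find_spec); the helper first_match and the module name order_demo_mod are invented for the example, and nothing is actually imported.

```python
import importlib.machinery
import os
import tempfile


def first_match(name, search_path):
    """Return the file the path-based finder would pick for name, or None."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return None if spec is None else spec.origin


# Two directories that both contain a module with the same name.
dir_a = tempfile.mkdtemp()
dir_b = tempfile.mkdtemp()
for directory in (dir_a, dir_b):
    with open(os.path.join(directory, "order_demo_mod.py"), "w") as fh:
        fh.write("# empty module for the demo\n")

# Whichever directory comes first in the search order wins.
picked_ab = first_match("order_demo_mod", [dir_a, dir_b])
picked_ba = first_match("order_demo_mod", [dir_b, dir_a])
missing = first_match("order_demo_no_such_mod", [dir_a, dir_b])
```

With the script directory last, a stray re.py there is only reached if no earlier entry provides re.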
From: Steven D'Aprano on 1 Nov 2009 01:54

On Sun, 01 Nov 2009 01:38:16 -0300, Gabriel Genellina wrote:

>> Incorrect. Simplicity of implementation and API is a virtue, in and of
>> itself. The existing module machinery is quite simple to understand,
>> use and maintain.
>
> Uhm... module objects might be quite simple to understand, but module
> handling is everything but simple! (simplicity of implem...? quite
> simple to WHAT? ROTFLOL!!! :) )

I stand corrected :)

Nevertheless, the API is simple: the first time you "import name", Python searches a single namespace (the path) for a module called name. There are other variants of import, but the basics remain:

search the path for the module called name, and do something with the first one you find.

>> Dealing with name clashes doesn't come for free. If you think it does,
>> I encourage you to write a patch implementing the behaviour you would
>> prefer.
>
> I'd say it is really a bug, and has existed for a long time.

Since import is advertised to return the first module with the given name it finds, I don't see it as a bug even if it doesn't do what the programmer intended it to do. If I do this:

    >>> len = 1
    >>> def parrot(s):
    ...     print len(s)
    ...
    >>> parrot("spam spam spam")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in parrot
    TypeError: 'int' object is not callable

it isn't a bug in Python that I have misunderstood scopes and inadvertently shadowed a builtin. Shadowing a standard library module is no different.

> One way to
> avoid name clashes would be to put the entire standard library under a
> package; a program that wants the standard re module would write "import
> std.re" instead of "import re", or something similar. Every time the std
> package is suggested, the main argument against it is backwards
> compatibility.

You could do it in a backwards compatible way, by adding the std package directory into the path.

--
Steven
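
The phrase "the first time you import name" matters: once a module is in sys.modules, later imports return the cached object and never search the path again, which is also why a previously loaded re hides the shadowing problem. A minimal sketch, assuming Python 3; the module name cache_demo_mod and the helper make_module_dir are made up for the example.

```python
import importlib
import os
import sys
import tempfile


def make_module_dir(value):
    """Create a directory holding cache_demo_mod.py with VALUE set to value."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "cache_demo_mod.py"), "w") as fh:
        fh.write("VALUE = %r\n" % value)
    return directory


dir_first = make_module_dir("first")
dir_second = make_module_dir("second")

try:
    # First import: the path is searched and dir_first wins.
    sys.path.insert(0, dir_first)
    importlib.invalidate_caches()
    first_import = importlib.import_module("cache_demo_mod")

    # A competing file now sits earlier on the path...
    sys.path.insert(0, dir_second)
    importlib.invalidate_caches()
    # ...but the import is answered from sys.modules, not from the path.
    second_import = importlib.import_module("cache_demo_mod")

    # Only after dropping the cache entry is the path searched again.
    del sys.modules["cache_demo_mod"]
    fresh_import = importlib.import_module("cache_demo_mod")
finally:
    # Leave sys.path and sys.modules as we found them.
    sys.path.remove(dir_first)
    sys.path.remove(dir_second)
    sys.modules.pop("cache_demo_mod", None)
```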
From: Gabriel Genellina on 1 Nov 2009 15:34

On Sun, 01 Nov 2009 02:54:15 -0300, Steven D'Aprano <steve(a)remove-this-cybersource.com.au> wrote:

> On Sun, 01 Nov 2009 01:38:16 -0300, Gabriel Genellina wrote:
>>> Incorrect. Simplicity of implementation and API is a virtue, in and of
>>> itself. The existing module machinery is quite simple to understand,
>>> use and maintain.
>>
>> Uhm... module objects might be quite simple to understand, but module
>> handling is everything but simple! (simplicity of implem...? quite
>> simple to WHAT? ROTFLOL!!! :) )
>
> I stand corrected :)
> Nevertheless, the API is simple: the first time you "import name", Python
> searches a single namespace (the path) for a module called name. There
> are other variants of import, but the basics remain:
>
> search the path for the module called name, and do something with the
> first one you find.

Sure, beautiful, a plain and simple search over a list of directories. That's how it worked in Python 1.4, I think...

Now you have lots of "hooks" and even "meta-hooks": sys.meta_path, sys.path_hooks, sys.path_importer_cache. And sys.path, of course, which may contain other things apart from directory names (zip files, eggs, and even instances of custom "loader" objects...). PEP 302 explains this but I'm not sure the description is still current. PEP 369, if approved, would add even more hooks.

Add packages to the picture, including relative imports and __path__[] processing, and it becomes increasingly hard to explain. Brett Cannon has rewritten the import system in pure Python (importlib) for 3.1; this should help to understand it, I hope.

The whole system works, yes, but looks to me more like a collection of patches over patches than a coherent system. Perhaps this is due to the way it evolved.

>>> Dealing with name clashes doesn't come for free. If you think it does,
>>> I encourage you to write a patch implementing the behaviour you would
>>> prefer.
>>
>> I'd say it is really a bug, and has existed for a long time.
>
> Since import is advertised to return the first module with the given name
> it finds, I don't see it as a bug even if it doesn't do what the
> programmer intended it to do. [...] Shadowing a standard library module
> is no different.

But that's what namespaces are for; if the standard library had its own namespace, such collisions would not occur. I can think of C++, Java, C#, all of them have some way of qualifying names. Python too - packages. But nobody has come up with a method to apply packages to the standard library in a backwards compatible way. Perhaps those name collisions are not considered serious. Perhaps every user module should live in packages and only the standard library has the privilege of using the global module namespace.

Both C++ and XML got namespaces late in their life so in principle this should be possible.

>> One way to
>> avoid name clashes would be to put the entire standard library under a
>> package; a program that wants the standard re module would write "import
>> std.re" instead of "import re", or something similar. Every time the std
>> package is suggested, the main argument against it is backwards
>> compatibility.
>
> You could do it in a backwards compatible way, by adding the std package
> directory into the path.

Unfortunately you can't, at least not without some special treatment of the std package. One of the undocumented rules of the import system is that you must not have more than one way to refer to the same module (in this case, std.re and re). Suppose someone imports std.re; an entry in sys.modules with that name is created. Later someone imports re; as there is no entry in sys.modules with such name, the re module is imported again, resulting in two module instances, darkness, weeping and the gnashing of teeth :)

(I'm sure you know the problem: it's the same as when someone imports the main script as a module, and gets a different module instance because the "original" is called __main__ instead).

--
Gabriel Genellina
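
The "two module instances" problem described above can be reproduced with a throwaway package. This is a sketch assuming Python 3; std_demo_pkg and dup_demo_mod are hypothetical names standing in for std and re, and both the package's parent directory and the package directory itself are put on sys.path, as the "backwards compatible" proposal would require.

```python
import importlib
import os
import sys
import tempfile

# Layout: <root>/std_demo_pkg/__init__.py and <root>/std_demo_pkg/dup_demo_mod.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "std_demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("# package marker\n")
with open(os.path.join(pkg_dir, "dup_demo_mod.py"), "w") as fh:
    fh.write("counter = 0\n")

# Make the same file reachable under two different names.
sys.path[:0] = [root, pkg_dir]
importlib.invalidate_caches()
try:
    qualified = importlib.import_module("std_demo_pkg.dup_demo_mod")
    bare = importlib.import_module("dup_demo_mod")
finally:
    sys.path.remove(root)
    sys.path.remove(pkg_dir)

# Same source file, but two independent module objects with separate state.
same_file = os.path.samefile(qualified.__file__, bare.__file__)
qualified.counter += 1
counters = (qualified.counter, bare.counter)
```

State changed through one name is invisible through the other, which is the kind of surprise the "one name per module" rule is meant to prevent.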
From: MRAB on 1 Nov 2009 17:01

Gabriel Genellina wrote:
[snip]
>>> One way to avoid name clashes would be to put the entire standard
>>> library under a package; a program that wants the standard re
>>> module would write "import std.re" instead of "import re", or
>>> something similar. Every time the std package is suggested, the
>>> main argument against it is backwards compatibility.
>>
>> You could do it in a backwards compatible way, by adding the std
>> package directory into the path.
>
> Unfortunately you can't, at least not without some special treatment
> of the std package. One of the undocumented rules of the import
> system is that you must not have more than one way to refer to the
> same module (in this case, std.re and re). Suppose someone imports
> std.re; an entry in sys.modules with that name is created. Later
> someone imports re; as there is no entry in sys.modules with such
> name, the re module is imported again, resulting in two module
> instances, darkness, weeping and the gnashing of teeth :) (I'm sure
> you know the problem: it's the same as when someone imports the main
> script as a module, and gets a different module instance because the
> "original" is called __main__ instead).
>
Couldn't the entry in sys.modules be where the module was found, so that if 're' was found in 'std' then the entry is 'std.re' even if the import said just 're'?
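
The import system does not do what the question proposes, but the effect it is after (two spellings, one module object) can be approximated by hand, because sys.modules is an ordinary dict. The following is only a sketch of that effect under Python 3, not the proposal itself; alias_module and the alias name std_demo_json are made up for the example, with json standing in for re.

```python
import importlib
import sys


def alias_module(canonical_name, alias):
    """Import canonical_name and register the same object under alias."""
    module = importlib.import_module(canonical_name)
    sys.modules[alias] = module  # both keys now point at one module object
    return module


canonical = alias_module("json", "std_demo_json")

# Importing the alias is answered from sys.modules; nothing is loaded twice.
via_alias = importlib.import_module("std_demo_json")
same_object = via_alias is canonical

# Remove the demo alias again.
del sys.modules["std_demo_json"]
```

This only covers a single top-level name; doing the same for a whole package tree would need the "special treatment" mentioned in the quoted reply.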