Prev: Python/HTML integration: phileas v0.3 released
Next: scanning under windows WIA with custom settings (dpi / etc )
From: Joshua Bronson on 20 Nov 2009 00:33 On Nov 19, 9:17 pm, Carl Banks <pavlovevide...(a)gmail.com> wrote: > Apart from the GPL what Ben said :) > it seems perfectly fine to release, and looks like > an interesting strategy. I've wanted one of those once in a while, > never enough to bother looking for one or writing one myself. glad to hear it! i'll release it to pypi if such feedback continues. > But you should absolutely not inherit from dict if you're overriding > all it's methods. It's useless and wasteful to do that, perhaps > dangerous. You end up using bytes for a small hash table that's never > used. > > Plus Python 3 has a notion of Abstract Base Classes: it will allow > customization of isinstance to advertise that your class implements > MutableMapping, which is the right way to do it. Actually that's what I was originally thinking of doing but didn't go through with it in my first pass out of concern that users might want isinstance(bijection(), dict) to be True. Now that you bring it up, I agree that it's the correct way to do it, and have reimplemented bijection as a MutableMapping (ABCs are actually in Python 2.6). Take a peek at the new and improved http://bitbucket.org/jab/toys/src/tip/bijection.py if you get a chance and let me know how it looks! Anyone have any other feedback? For instance, is offering the __call__ syntax for the inverse mapping wonderful or terrible, or maybe both? Thanks, Josh
From: Terry Reedy on 20 Nov 2009 15:09 Joshua Bronson wrote: > Anyone have any other feedback? For instance, is offering the __call__ > syntax for the inverse mapping wonderful or terrible, or maybe both? Terrible ;-) Use standard subscripting with slices, and only that, to both get and set. Let m[4] == m[4:] == 'abc' # m[4:] is suggested alternative addition Then m[:'abc'] == 4 m[4:] passes slice(4,None,None) to __getitem__ m[:'abc'] passes slice(None,'abc',None) It just happens that dict items and slices use the same notation, but they do, so take advantage of that. In fact, to emphasize the symmetry of the bijective map, consider disallowing m[key] as ambiguous and require m[key:], along with m[:key] to access and set. Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so this approach does not even prevent using slices as keys/values. In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an error. In __getitem__, it could either be a error or return True/False depending on whether the pair is in the map. But this depends on whether __contains__ only tests keys or is modified to test pairs. Terry Jan Reedy
From: Joshua Bronson on 20 Nov 2009 16:59 On Nov 20, 3:09 pm, Terry Reedy <tjreedy(a)udel.edu> wrote: > Joshua Bronson wrote: > > Anyone have any other feedback? For instance, is offering the __call__ > > syntax for the inverse mapping wonderful or terrible, or maybe both? > > Terrible ;-) > > Use standard subscripting with slices, and only that, to both get and set.. > > Let m[4] == m[4:] == 'abc' # m[4:] is suggested alternative addition > Then m[:'abc'] == 4 > > m[4:] passes slice(4,None,None) to __getitem__ > m[:'abc'] passes slice(None,'abc',None) > > It just happens that dict items and slices use the same notation, but > they do, so take advantage of that. In fact, to emphasize the symmetry > of the bijective map, consider disallowing m[key] as ambiguous and > require m[key:], along with m[:key] to access and set. > > Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so > this approach does not even prevent using slices as keys/values. > > In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an > error. In __getitem__, it could either be a error or return True/False > depending on whether the pair is in the map. But this depends on whether > __contains__ only tests keys or is modified to test pairs. > > Terry Jan Reedy absolutely genius. implemented in the latest version: http://bitbucket.org/jab/toys/src/tip/bijection.py thank you for the terrific idea!
From: Joshua Bronson on 21 Nov 2009 02:34 On Nov 20, 3:09 pm, Terry Reedy <tjreedy(a)udel.edu> wrote: > Use standard subscripting with slices, and only that, to both get and set.. i did this for __delitem__ too, so you can do e.g. del m[:'abc']. > In fact, to emphasize the symmetry of the bijective map, consider > disallowing m[key] as ambiguous and require m[key:] my initial feeling is that i'd prefer to be able to say m[key], the defense being that it's not ambiguous if it's documented, and users who don't like it don't have to use it, while users who do like it won't be alienated. but i am definitely still open to this. > Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so > this approach does not even prevent using slices as keys/values. except slices aren't hashable.</nitpick> but props for even thinking of that! > In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an > error. In __getitem__, it could either be a error or return True/False > depending on whether the pair is in the map. i went with raising an error for __getitem__(slice(a,b,None)). returning True/False for this usage based on whether a -> b is in the bijection is an interesting idea, but i feel like it complicates things for no good reason: if that's what you wanted to know you'd just ask whether m[a] == b. > But this depends on whether __contains__ only tests keys or is modified to test pairs. i have __contains__ only testing keys, and similarly [i for i in bijection] only gives you the keys, again on the theory that deviating too much from the dict api increases the learning (adoption) curve, so i think we should only do it if it buys us a tremendous usability win. thanks again for your insightful input, i'm pretty psyched about how this is coming along! any further feedback is always welcome. josh
From: Raymond Hettinger on 21 Nov 2009 21:22
On Nov 19, 3:24 pm, Joshua Bronson <jabron...(a)gmail.com> wrote: > I couldn't find a library providing a bijective map data structure > (allowing for constant-time lookups by value) in the few minutes I > looked, so I took a few more minutes to code one up:http://bitbucket.org/jab/toys/src/tip/bijection.py > > Is this at all worth releasing? Comments and suggestions welcome. > > Josh Hello Joshua, I have a few design ideas and comments for you. * The idea of using __call__ for looking-up inverse values was inspired. That is useable, clean, and easy to remember; however, as discussed below, there are issues though with its actual use in real code. * Am not excited by the inverse iterators. With just a regular mapping you can write: for a, b in m.items() ... # consider either a or b be the key and the other to be the value That meets all of the needs that would have been served by iter_inverse_keys() or iter_inverse_values() or whatnot. The mirrored API doesn't really provide much in the way of value added. * After exercising the API on a couple of samples, I'm worried that almost any inverse-method based API makes it difficult to review code while keeping straight the intended meaning of the forward and inverse relationships. Am thinking that it is simpler, faster, and clearer to just use two dictionaries -- that approach lets the variable names communicate the important info. For example, the following code helps keep the coder and code reviewer from conflating the forward and inverse directions: myurl = ip2url[myip] myip = url2ip[myurl] Contrast that with: myurl = site_bijection[myip] myip = site_bijection(myurl) With the latter, it is darned difficult to detect accidental conflation of brackets with parentheses. * So, I'm thinking that code needing a bijection would be better-off with two ordinary dicts, perhaps augmented by a couple of convenience functions: biject_add(site_bijection, ip=myip, url=myurl) # Create a new pairing, raise ValueError if either key # maps to more than one value (in violation of the # bijection invariant: one-to-one and onto) biject_remove(ip=myip) # Clear an entry from both dicts or biject_remove(url=myurl) Alternatively, another possible approach is to used use the class generation approach (such as that used by named_tuple()) to generate a custom bijection class with either attribute based or keyworded accessors: Attribute based accessors: site = Bijection('ip', 'url') site.url[myip] = myurl for ip, url in site.items() ... print site.ip[myurl] myurl = site.url.pop(myip) Keyword accessors: site = Bijection('ip', 'url') site.set(ip=myip, url=myurl) myurl = site.get(ip=myip) myip = set.get(url=myurl) myurl = site.pop(ip=myip) site.del(ip=myip) site.del(url=myurl) Hope these ideas help. The ultimate success of the Bijection code will depend on its clarity, simplicity, and speed. Experiment with various approaches to find-out which looks the best in real code. It cannot be error-prone or it is doomed. Also, it should not introduce much overhead processing or else people will avoid it. The API should be trivially simple so that people remember how to use it months after seeing it for the first time. Good luck and happy hunting, Raymond |