From: Ethan Furman on 27 May 2010 15:45 Let's say I have two tables: CatLovers DogLovers ------------------- ------------------- | name | age | | name | age | |-----------------| |-----------------| | Allen | 42 | | Alexis | 7 | | Jerod | 29 | | Michael | 21 | | Samuel | 17 | | Samuel | 17 | | Nickalaus | 55 | | Lawrence | 63 | | Frederick | 34 | | Frederick | 34 | ------------------- ------------------- NumberOfPets --------------------------- | name | cats | dogs | --------------------------- | Allen | 2 | 0 | | Alexis | 0 | 3 | | Michael | 0 | 1 | | Samuel | 1 | 2 | | Jerod | 3 | 0 | | Nickalaus | 5 | 0 | | Lawrence | 0 | 1 | | Frederick | 3 | 2 | --------------------------- (I know, I know -- coming up with examples has never been my strong point. ;) catlovers = dbf.Table('CatLovers') doglovers = dbf.Table('DogLovers') petcount = dbf.Table('NumberOfPets') For the sake of this highly contrived example, let's say I'm printing a report that I would like in alphabetical order of those who love both cats and dogs... def names(record): return record.name c_idx = catlovers.create_index(key=names) d_idx = doglovers.create_index(key=names) p_idx = petcount.create_index(key=names) # method 1 for record in c_idx: if record in d_idx: print record.name, record.age, \ p_idx[record].cats, p_idx[record].dogs *or* # method 2 for record in c_idx: if d_idx.key(record) in d_idx: # or if names(record) in d_idx: print record.name, record.age \ p_idx[record].cats, p_idx[record].dogs Which is better (referring to the _in_ statement)? Part of the issue revolves around the question of is _any_ record in the CatLovers table really in the DogLovers index? Obviously no -- so if you are asking the question in code you are really asking if a record from CatLovers has a matching key value in DogLovers, which means either the __contains__ code can apply the key function to the record (implicit, as in method 1 above) or the calling code can do it (explicit, as in method 2 above). I'm leaning towards method 1, even though the key function is then called behind the scenes, because I think it makes the calling code cleaner. Opinions? ~Ethan~
From: D'Arcy J.M. Cain on 27 May 2010 16:04 On Thu, 27 May 2010 12:45:58 -0700 Ethan Furman <ethan(a)stoneleaf.us> wrote: > Let's say I have two tables: > > CatLovers DogLovers > ------------------- ------------------- > | name | age | | name | age | > |-----------------| |-----------------| [...] > > NumberOfPets > --------------------------- > | name | cats | dogs | > --------------------------- [...] First problem is learning to count. :-) Second (first real) problem is that you database is not normalized. If all of the cat lovers and dog lovers are in NumberOfPets then move the age into that. Probably should rename it as well. Finally, are these SQL databases? The best way of getting information is with SQL. SELECT * FROM NumberOfPets WHERE name IN (SELECT name FROM CatLovers) OR name IN (SELECT name FROM DogLovers) ORDER BY name; > catlovers = dbf.Table('CatLovers') > doglovers = dbf.Table('DogLovers') > petcount = dbf.Table('NumberOfPets') I guess you should tell us what dbf is. It doesn't seem to be a standard module and it doesn't look like DB-API. It's hard to answer your question without knowing what these functions do. -- D'Arcy J.M. Cain <darcy(a)druid.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
From: Ethan Furman on 27 May 2010 16:06 Ethan Furman wrote: > Let's say I have two tables: Okay, let's say I have three tables instead. ;p
From: Ethan Furman on 27 May 2010 16:33 D'Arcy J.M. Cain wrote: > On Thu, 27 May 2010 12:45:58 -0700 > Ethan Furman <ethan(a)stoneleaf.us> wrote: >> Let's say I have two tables: >> >> CatLovers DogLovers >> ------------------- ------------------- >> | name | age | | name | age | >> |-----------------| |-----------------| > [...] >> NumberOfPets >> --------------------------- >> | name | cats | dogs | >> --------------------------- > [...] > > First problem is learning to count. :-) Heh -- like I said, I'm terrible at making examples. Originally it was two tables, then I threw in a third so I could show using an index as a dictionary. > Second (first real) problem is that you database is not normalized. > If all of the cat lovers and dog lovers are in NumberOfPets then move > the age into that. Probably should rename it as well. The actual tables that I would be using are different versions of mailing tables, with one table not having as many records as the other, and wanting to do something with the records that are in common between the two. > Finally, are these SQL databases? The best way of getting information > is with SQL. The format of these tables is either dBase III or Visual FoxPro 6, and dbf is a module I have coded (still working on SQL support for it... getting closer!). A slightly out-of-date version of it is available on PyPI. My question centers around the __contains__ method of the Index object that I am writing, and whether it should call the key function of the index when passed a record, or if I should make the calling code do that. Hope this helps. ~Ethan~
From: Christian Heimes on 27 May 2010 16:27 > Finally, are these SQL databases? The best way of getting information > is with SQL. > > SELECT * FROM NumberOfPets > WHERE name IN (SELECT name FROM CatLovers) OR > name IN (SELECT name FROM DogLovers) > ORDER BY name; A good way is to use SQL with JOINs instead of horrible nested selects. Although SQL is declarative, you shouldn't make the work of the query optimizer so hard. Christian
|
Next
|
Last
Pages: 1 2 Prev: matplotlib: show xticks only for discrete times Next: Tryton 1.6 series is out |