Prev: Project Explrer
Next: VB Editor in Excel
From: Schmidt on 1 Apr 2010 16:45 "Jeff Johnson" <i.get(a)enough.spam> schrieb im Newsbeitrag news:eqAW1Ma0KHA.260(a)TK2MSFTNGP05.phx.gbl... > Enjoy: > > SELECT GCount, Genre, main.Artist > FROM (SELECT Artist, Genre, COUNT(*) AS GCount FROM songs GROUP BY Artist, > Genre) AS main INNER JOIN (SELECT Artist, Max(GCount2) AS MaxGenre FROM > (SELECT Artist, Genre, COUNT(*) AS GCount2 FROM songs GROUP BY Artist, > Genre) AS sub2 GROUP BY Artist) AS sub1 ON sub1.Artist = main.Artist AND > main.GCount = sub1.MaxGenre > > This sort of thing is one of the most complex things you can do > in SQL since it goes against the set theory that is the very heart > of SQL. I recommend you break it apart into its component > pieces to see what's actually happening. I actually recommend > doing this sort of row-based filtering thing in code instead of > convoluted SQL. Actually this can be solved with less complexity by using one of the two not that wellknown Aggregate-Functions, built into the JET-SQL-language - meaning First() and Last()... This example is using Last(), but First() would also do, if one changes the Order By Clause of the following SubSelect-Definition (which is basically the same SQL as David has mentioned in its Opener-Posting). strSubSelect = "Select Count(*) As GCount, Genre, Artist From Songs " & _ "Group By Genre, Artist Order By GCount, Genre Desc" Ok what now remains is the construction of the final (SubSelect-wrapping) Statement, which does look pretty simple due to the usage of the Last() aggregate-functions: strSQL = "Select Last(GCount), Last(Genre), Artist From (" & _ SubSelect & _ ") Group By Artist" Only tested here with SQLite (which has a little bit different syntax with the First() and Last() aggregates), but the above is already "JET-adapted from my memory" and should give the correct results against an *.mdb too - would be nice if you (or David) could check it on your side. Olaf
From: Schmidt on 1 Apr 2010 20:14 "Schmidt" <sss(a)online.de> schrieb im Newsbeitrag news:%23m4Y5xd0KHA.3500(a)TK2MSFTNGP02.phx.gbl... > This example is using Last(), but First() would also do, if > one changes the Order By Clause of the following > SubSelect-Definition (which is basically the same SQL as > David has mentioned in its Opener-Posting). > > strSubSelect = "Select Count(*) As GCount, Genre, Artist From Songs " & _ > "Group By Genre, Artist Order By GCount, Genre Desc" > > Ok what now remains is the construction of the final > (SubSelect-wrapping) Statement, which does look pretty > simple due to the usage of the Last() aggregate-functions: > > strSQL = "Select Last(GCount), Last(Genre), Artist From (" & _ > SubSelect & _ > ") Group By Artist" Ok, before you guys "go nuts" (just tested with an *.mdb here) ... the approach *is* possible with JET, but only if the above SubSelect is transferred beforehand (per Select Into ...) into a temporary (but nonetheless physical) Table. Only then does JET use the correctly ordered Records from within the Last() or First() aggregates. A dynamic interpretation of the SubSelect, given as "mere string" (as in the above example) does not work correctly in conjunction with these aggregates, no matter how the SubSelect was ordered beforehand - so in this "dynamic case" Last() and First() do *not* respect the incoming sort-order of the records within the dynamic SubSelect; instead the physical representation - or better: the physical Insertion-Order within the underlying table (in this case the table [Songs]) for Feeding these aggregates within the context of an "artist-grouping" is used. And that (working with a temp-table, to make the above approach behave correctly) is probably not what David had in mind. Sorry for the confusion - seems this restriction to the Insertion-Order of records in physical tables, brings the usability of the First() and Last() aggregates in JET-SQL to near zero. But maybe this here can work in Davids special case with a somewhat better performance (I have to come up with something as short as the above now, do I? <g>): 'now all in one SQL-String (correct sorting is indirectly performed too) SQL = "Select 99999-Left(Min(GInfo),5) As GenreCount," & _ "Mid(Min(GInfo),6) As Genre," & _ "Artist " & _ "From (" & _ "Select 99999-Count(Genre) & Genre As GInfo," & _ "Artist From Songs Group By Artist,Genre" & _ ") Group By Artist" Ok, looks a bit "weird" - place it in the "fun-category" ... ;-) But as said, should work though - @David... would be interested in the performance-results (should run pretty well, even with all the String- and Math-Ops involved). Olaf
From: David Kaye on 1 Apr 2010 22:23 "Schmidt" <sss(a)online.de> wrote: >Ok, before you guys "go nuts" (just tested with an *.mdb here) ... >the approach *is* possible with JET, but only if the above >SubSelect is transferred beforehand (per Select Into ...) >into a temporary (but nonetheless physical) Table. >Only then does JET use the correctly ordered Records >from within the Last() or First() aggregates. Yeah, I tried it out and it didn't work for me. Again, though, the SQL statement I raved about works very well. Just for ducks I compacted the database, dropped the artist index and recreated it and the query now takes only about 0.8 seconds to run. The actual record count at this time is 37,425, so I'd say that's fairly fast given the complexity of the query.
From: Schmidt on 2 Apr 2010 11:42 "David Kaye" <sfdavidkaye2(a)yahoo.com> schrieb im Newsbeitrag news:hp3gu0$eme$1(a)news.eternal-september.org... [First(), Last() approach...] > Yeah, I tried it out and it didn't work for me. Again, > though, the SQL statement I raved about works very well. > Just for ducks I compacted the database, dropped the artist > index and recreated it and the query now takes only about 0.8 > seconds to run. The actual record count at this time is 37,425, > so I'd say that's fairly fast given the complexity of the query. Did you also tried my second attempt, at the end of my previous post, which avoids the First() and Last() stuff? BTW, there's also a new reply in the database-group now (from Jamal Samedov), which maybe also worth a try. Ok, your current 0.8 seconds based on Jeffs approach is already pretty good, something one can live with, but we all probably try to develop a better "performance- picture" over time (on what works good - and what works not that well with advanced SQL-queries) - and since my second query-approach is trying to trade "count of needed Sub-Queries" versus "more excessive Column-expressions"), it would be nice, if you could time that too - you are the man with the real-world-database, not that easy to create a large test-set which mimicks what you already have... :-) Olaf
From: David Kaye on 2 Apr 2010 14:43
"Schmidt" <sss(a)online.de> wrote: >Did you also tried my second attempt, at the end of my >previous post, which avoids the First() and Last() stuff? My version of Jet (3.6) doesn't seem to like it. I get a syntax error. Unfortunately, I don't have much time to pick it apart and see what's failing. >BTW, there's also a new reply in the database-group now >(from Jamal Samedov), which maybe also worth a try. No, his doesn't work. Again, I suggest people actually try the SQLs before posting them. His SQL returns the same artist multiple times, grouped by genre and not sorted by genre count. What I need is what Jeff (Jeff?) provided. I need a list of distinct (non-duplicated) list of artists. Next to each artist I need the genre they're "most famous" for. Returning the count of genres in the recordset isn't really necessary, but it serves as a check of whether the SQL is returning the proper records. In Jeff's case it is. In the example I cited, the following are a few records that come back: 7 Rap Andre Nickatina 62 Folk Ani DiFranco 89 Pop Annie Lennox Now, people may not agree with the genres (I certainly don't), but the data is gathered from many people, so it's sort of a consensus of what genres the people feel these artists represent. |