Prev: Query null value problem
Next: Lag 1 day relationship
From: esn on 13 May 2010 14:39 I am analyzing wildlife research data and need to find the record that represents the first time an individual animal was detected in a given year. I thought I had a scheme that should work, but it's failing. First I wrote a query that returns the year, individualID, date, visit (am or pm), and recordID (PK). The query is sorted sequentially by the first four fields in ascending order. It returns exactly what I had hoped - say individual #9999 was detected 5 times in 2009, those records are in correct chronological order, regardless of the recordID. I used this query as the source for a second query, which groups by year and individualID, and selects first([recordID]). To follow the example above, if the data was entered out of order, the first time (chronologically) that individual #9999 was detected could have a recordID of 5000, while a subsequent detection could have a recordID of 4000. For some reason, even though the first query places everything in chronological order, the first([recordID]) field will return 4000, instead of returning 5000 as I want it to. Any ideas? It's as if the order of the original query is lost when plugged into the second query. I tried adding date and visit to the ORDER BY clause of the second query, but I can't have them in the order by clause without performing an aggregate function on them, and if I stick them in the query with a function I still get the wrong recordID. I also tried removing the reference to Query1 and replacing it with the actual SQL from the first query - still I get the wrond recordID. Here's the query: SELECT Year, IndividualID, First(ID) AS [First Cap ID] FROM (SELECT Year([Date]) AS [Year], [Data].IndividualID, [Data].Date, [Data].Visit, [Data].ID FROM [Data] WHERE ((([Data].IndividualID) Is Not Null)) ORDER BY Year([Date]), [Data].IndividualID, [Data].Date, [Data].Visit) GROUP BY Year, IndividualID ORDER BY Year, IndividualID;
From: ghetto_banjo on 13 May 2010 15:48 Do not use the aggregate First function. It is not a reliable function to use, and does not really do what you think does. Instead, try the aggregate Min function. You can use the Min function on the Date field you have to return the first date for a particular an animal was detected. Side note, you want to consider renaming that field, "Date" is a reserved word for Access and can cause problems using it as a field name.
From: Marshall Barton on 13 May 2010 16:00 esn wrote: >I am analyzing wildlife research data and need to find the record that >represents the first time an individual animal was detected in a given >year. I thought I had a scheme that should work, but it's failing. >First I wrote a query that returns the year, individualID, date, visit >(am or pm), and recordID (PK). The query is sorted sequentially by >the first four fields in ascending order. It returns exactly what I >had hoped - say individual #9999 was detected 5 times in 2009, those >records are in correct chronological order, regardless of the >recordID. > >I used this query as the source for a second query, which groups by >year and individualID, and selects first([recordID]). To follow the >example above, if the data was entered out of order, the first time >(chronologically) that individual #9999 was detected could have a >recordID of 5000, while a subsequent detection could have a recordID >of 4000. For some reason, even though the first query places >everything in chronological order, the first([recordID]) field will >return 4000, instead of returning 5000 as I want it to. Any ideas? >It's as if the order of the original query is lost when plugged into >the second query. > >I tried adding date and visit to the ORDER BY clause of the second >query, but I can't have them in the order by clause without performing >an aggregate function on them, and if I stick them in the query with a >function I still get the wrong recordID. I also tried removing the >reference to Query1 and replacing it with the actual SQL from the >first query - still I get the wrond recordID. Here's the query: > >SELECT Year, IndividualID, First(ID) AS [First Cap ID] >FROM (SELECT Year([Date]) AS [Year], [Data].IndividualID, [Data].Date, >[Data].Visit, [Data].ID >FROM [Data] >WHERE ((([Data].IndividualID) Is Not Null)) >ORDER BY Year([Date]), [Data].IndividualID, [Data].Date, [Data].Visit) >GROUP BY Year, IndividualID >ORDER BY Year, IndividualID; The First function is mostly useless because it uses the first value retrieved from disk, which has nothing to do with the smallest value or the order records were created. Normally, you should use the Min function instead of First, but if the ID field is an AutoNumber, there is no guarantee the AutoNumbers are created in a monotonically increasing order. That means that you should find the minimum date for each ID and use that to find the desired record. A relatively simple way of doing that is something like: SELECT Year, IndividualID, [Data].Date As [First Cap ID], [Data].Visit, [Data].ID FROM [Data] WHERE [Data].IndividualID) Is Not Null And [Data].Date = (SELECT Min(X.Date) FROM [Data] As X WHERE Year(X,Date) = Year([Data].Date) And X,IndividualID = [Data].IndividualID ) ORDER BY Year, IndividualID; -- Marsh MVP [MS Access]
From: esn on 13 May 2010 16:19 Min([Date]) doesn't work because there can be more than one detection on a particular date. Also I'd rather return the ID field than return a date field, so that I can use those ID values in joins or criteria (using the In() function) later on to work with only records that represent a "first capture". I've tried going through this using Min([Date]) before and it requires a ton of extra steps (checking if the individual was detected more than once on that date, creating temporary tables to return updateable recordsets, etc), so I would much rather use the method I'm describing if there's any way to get it to work. Here is an example straight from the query results (Query2 is exactly as posted above, Query1 is the simpler SQL statement nested into the FROM clause of Query2): Query1 results - detections sorted chronologically: IndividualID Date Visit ID TAAM664 8/10/2009 AM 11893 TAAM664 8/10/2009 PM 11891 TAAM664 8/11/2009 AM 11892 Query2 results - First Cap ID should be the first ID listed above, but is in fact Min([ID]) IndividualID First Cap ID TAAM664 11891 According to my understanding of the first function (based on having used it before and everything in the help files), it should be returning 11893 from the results above. But for some reason, it seems to be sorting the results from Query1 by ID before processing them in Query2 and reordering them according to Query2's order by and group by statements. If what I'm attempting is not what the first function is meant to do, then is there another way anyone can think of to do it simply, without the drawbacks of selecting the minimum date and then joining that back to the table (which can return more than one record and returns a non- updateable recordset)?
From: vanderghast on 13 May 2010 17:06
Sure, it would help if you were having a full date AND time field, rather than a date and an AM/PM field. Two queries is probably the easiest way to go: SELECT IndividualID, MIN( [date] + iif( [AM/PM] = "PM", #12:00:00#, 0) ) AS DateAndPseudoTime FROM tableName GROUP BY IndividualID to be saved, as, say, q1, then SELECT a.* FROM tableName AS a INNER JOIN q1 ON a.IndividualID=q1.IndividualID WHERE a.[date]+ iif( [AM/PM] ="PM", #12:00:00#, 0 ) = q1.DateAndPseudoTime should return the desired result. Again, it would be easier with full date AND TIME field, rather than two different fields which I assume were [date] and [AM/PM]) For completeness, FIRST and LAST can be very useful if you wish to get data from the same record: SELECT f1, LAST(f2), LAST(f3) FROM somewhere GROUP BY f1 given that somewhere is: f1 f2 f3 1 10 20 1 20 10 1 15 30 1 17 5 could return any of the four record (depends on the execution plan), while SELECT f1, MIN(f2), MIN(f3) FROM somewhere GROUP BY f1 would return 1 10 5 where values come from different records! Same with using MAX(f2), MAX(f3). But definitively, First (and Last) do not mean earliest (latest), neither TOP 1. Vanderghast, Access MVP |