From: esn on
Hi everyone,

This should be simple. I'm trying to pull one random record from each
group within a table. The groups are "Units" (geographic areas), and
I want to select one random point (conveniently enough, "Point") from
each Unit that satisfies certain criteria. Step 1 is a query with a
random number field, Rnd([EventID]), sorted by that field. EventID is
an autonumber PK field. Step 2 is a query that groups by Unit and
pulls the value of interest (in this case Point) from the first record
in each group. The problem is that the first query doesn't actually
perform the sort correctly. Here are the first few values in the
random number field, which is set to sort in ascending order:

RandomNumber
0.212475836277008
0.456852912902832
0.35159033536911
0.721272110939026
0.638044655323029

Clearly that's not ascending order - it doesn't appear to be any order
at all. Seems like the logic of using this setup to select a random
record is violated if the records don't actually get sorted
correctly. Every time I run the query I end up with a different
record on top, so it seems to be sorting "randomly" somehow but not by
the random number field. Any idea what's up? Is it recalculating the
random numbers after sorting the records or something?

PS - I know there are additional problems with trusting a last or
first function to do anything meaningful - that's the next hurdle but
at the moment I'd like to get the first step worked out. If anyone
has a good suggestion for returning the top 1 record within a group
(without using the first or last functions) that would help too.
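
On the PS: one way to get the top 1 record per group without First or
Last is to store the random number once per row and then keep, for each
Unit, the row holding that Unit's smallest stored value. What follows is
only a sketch in Python with SQLite, not Access. The table name
`points`, the column `rnd_key`, the function name, and the sample rows
are all made up for illustration, and the same correlated-MIN pattern
would still need to be checked in Access itself.

```python
import random
import sqlite3

def one_random_point_per_unit(rows, seed=None):
    """rows: iterable of (unit, point). Returns {unit: point}."""
    rng = random.Random(seed)
    con = sqlite3.connect(":memory:")
    # Hypothetical table: rnd_key holds a random number drawn exactly once.
    con.execute("CREATE TABLE points (unit INTEGER, point TEXT, rnd_key REAL)")
    con.executemany(
        "INSERT INTO points VALUES (?, ?, ?)",
        [(unit, point, rng.random()) for unit, point in rows],
    )
    # Keep, per unit, the row whose stored key is that unit's minimum.
    picked = con.execute(
        """
        SELECT p.unit, p.point
        FROM points AS p
        WHERE p.rnd_key = (SELECT MIN(q.rnd_key)
                           FROM points AS q
                           WHERE q.unit = p.unit)
        ORDER BY p.unit
        """
    ).fetchall()
    con.close()
    return dict(picked)

# Illustrative sample rows only.
sample = [(1, "OO007"), (1, "RR007"), (1, "QQ003"),
          (2, "II006"), (2, "LL001"), (3, "II009")]
result = one_random_point_per_unit(sample, seed=1)
print(result)
```

Because the keys are stored before the grouping query runs, the
selection cannot shift while the query is being evaluated.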
From: esn on
Also, I just noticed Access redraws the random number every time I
click in one of the records of the query results, and seems to
struggle with recalculating (often tries to display the original and
new values at the same time in the same cell). Is this query just
recalculating and that's why the records never really appear in any
sort of order?
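
A small Python simulation of that suspicion (it does not run Access, and
the function name is made up): if one set of random values is drawn for
the sort and a fresh set is drawn when the rows are displayed, the
displayed column will generally look unsorted even though the rows
really were ordered by random numbers.

```python
import random

def sort_then_redraw(record_ids, seed=None):
    rng = random.Random(seed)
    # First pass: draw one key per record and sort the records by it.
    keyed = sorted((rng.random(), rec) for rec in record_ids)
    order = [rec for _, rec in keyed]
    sort_keys = [key for key, _ in keyed]
    # Second pass: draw fresh values for display, the way a field that is
    # recalculated on every repaint would.
    displayed = [rng.random() for _ in order]
    return order, sort_keys, displayed

order, sort_keys, displayed = sort_then_redraw(range(1, 9), seed=42)
print(order)      # the record order is a random shuffle
print(sort_keys)  # ascending: these are the values the sort used
print(displayed)  # fresh draws: usually not in ascending order
```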
From: PieterLinden via AccessMonster.com on
esn wrote:
>Also, I just noticed Access redraws the random number every time I
>click in one of the records of the query results, and seems to
>struggle with recalculating (often tries to display the original and
>new values at the same time in the same cell). Is this query just
>recalculating and that's why the records never really appear in any
>sort of order?

Read this:
http://www.mvps.org/access/queries/qry0011.htm

From: esn on
Thanks for the replies - I care how the record was chosen only in that
I need it to be random (or reasonably close to random). When I
checked to make sure that Access was functioning properly to select a
random record, there seemed to be a glitch, so I thought I would run
it by the experts. I figured this was just a recalculating issue and
that, at some point, the order of the records had been randomized, but
I wanted to be sure before I went too much further. And it's good to
know it's possible to build a function to stop Access from
recalculating the random field - given the crummy performance of
queries based on this one I might end up using that to speed things
up.
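
The general idea of "stop the recalculating" can be sketched as a cache
keyed by record ID: the first request for a record draws a number, and
every later request returns the same one. This Python sketch is not the
function from the linked page; the class and method names are
hypothetical and only illustrate the principle.

```python
import random

class FrozenRandom:
    """Hands out one random number per key, then keeps returning it."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._cache = {}

    def value_for(self, key):
        # Draw only on the first request for this key.
        if key not in self._cache:
            self._cache[key] = self._rng.random()
        return self._cache[key]

frozen = FrozenRandom(seed=7)
first_pass = [frozen.value_for(event_id) for event_id in (101, 102, 103)]
second_pass = [frozen.value_for(event_id) for event_id in (101, 102, 103)]
print(first_pass == second_pass)  # True
```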

Now I have a question about the next step - here's the SQL I'm using
right now:

SELECT [GLSA Caps with Unit].Unit, [GLSA Caps with Unit].Point
FROM [GLSA Caps with Unit]
WHERE ((([GLSA Caps with Unit].Point) In
    (SELECT TOP 3 [GLSA Caps with Unit_1].Point
     FROM [GLSA Caps with Unit] AS [GLSA Caps with Unit_1]
     WHERE ((([GLSA Caps with Unit_1].Unit)=[GLSA Caps with Unit].Unit))
     ORDER BY Rnd([RndSeed]))))
ORDER BY [GLSA Caps with Unit].Unit;

And here's the output:

Unit Point
1 OO007
1 RR007
2 II006
2 LL001
2 LL005
2 MM001
2 MM002
3 II009
3 LL011
3 OO008
4 BB002
4 BB005
4 CC003
5 BB013
5 CC008
5 FF011
5 GG010
5 HH009
5 HH011
6 FF013
7 S002
7 U002
7 V003

Note the variable number of records per unit. FYI - Point is a text
field that identifies a geographic location (as I stated above) within
a grid based on a row identifier (a single or double letter) and a
column identifier (3 digits from 000 to 110). The source query (GLSA
Caps with Unit):

SELECT [Grid Point Info].Unit, [Trapping Data Records Table].Point,
    Min([Trapping Data Records Table].[Capture/Event ID]) AS RndSeed
FROM [Grid Point Info] INNER JOIN [Trapping Data Records Table]
    ON [Grid Point Info].LetterNumb = [Trapping Data Records Table].Point
WHERE ((([Trapping Data Records Table].[Species/Event])="GLSA"))
GROUP BY [Grid Point Info].Unit, [Trapping Data Records Table].Point;

To anticipate the first question - I already checked to make sure that
"GLSA Caps with Unit" returns at least three points per unit, and it
does. I've also tried using the "randomizer" custom function from the
link above, but I still get similar results. If I run the subquery on
its own using "Unit=1" as criteria I get the right results (3 random
points in unit 1). So why does the query return fewer than three
points for units 1 and 6, and how can a subquery with a TOP 3 clause
be returning more than 3 points for some of the units?
From: Marshall Barton on

Sorry, but I am having a seriously tough time unraveling
where the random numbers are being recalculated. This is
especially compounded by the query optimizer doing whatever
it wants to combine your three queries into one with who
knows what effect on the random numbers.

I have not been able to explain the varying number of
records, even allowing for the fact that TOP 3 will return
more than 3 records when there is a tie for the third value
in the sorted list.
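
One possibility that would be consistent with the uneven counts,
offered as an assumption and not a confirmed diagnosis: if the
correlated TOP 3 subquery is evaluated again for every outer row, with
fresh random numbers each time, then each row is tested against a
different random trio from its unit, and a unit can end up with anywhere
from zero to all of its points. The Python sketch below simulates only
that assumption. It does not run Access, and the function name and the
point labels are made up.

```python
import random

def rows_kept(points_by_unit, top_n=3, seed=None):
    rng = random.Random(seed)
    kept = {}
    for unit, points in points_by_unit.items():
        kept[unit] = []
        for point in points:
            # Assumed behaviour: the subquery is re-run for this row with
            # new random sort keys, so the "top n" set differs per row.
            top = rng.sample(points, min(top_n, len(points)))
            if point in top:
                kept[unit].append(point)
    return kept

# Seven hypothetical units with eight points each.
units = {u: [f"P{u}{i:02d}" for i in range(8)] for u in range(1, 8)}
counts = {u: len(p) for u, p in rows_kept(units, seed=0).items()}
print(counts)
```

Under this assumption a unit with exactly three points would always
return all three, while larger units would vary from run to run.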

--
Marsh
MVP [MS Access]