From: kevin.hall on
Modulo is a poor operation in helping find a random number. The reason
is that the lowest bits in a psuedo-random number are less random than
the upper bits. It's much better to use division:

In the example above, the range [0,3] contains 4 numbers. [0,9]
contains 10. 10/4 = 2 (integer division). 4*2 = 8 --> valid range for
not rerolling must have 8 numbers: [0,7]. This is the same as above.
But this time, rather than using modulo, integer divide by 2 (the
result of 10/4):

0 -> 0
1 -> 0
2 -> 1
3 -> 1
4 -> 2
5 -> 2
6 -> 3
7 -> 3
8 -> reroll
9 -> reroll


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Maxim Yegorushkin on

kevin.hall(a)motioneng.com wrote:
> Modulo is a poor operation in helping find a random number. The reason
> is that the lowest bits in a psuedo-random number are less random than
> the upper bits.

Are you talking about a specific implementation or does this behavior
apply to any implementation of rand()?


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: kanze on
kevin.hall(a)motioneng.com wrote:
> Modulo is a poor operation in helping find a random number.
> The reason is that the lowest bits in a psuedo-random number
> are less random than the upper bits.

If that's the case, then it isn't a very good random number
generator, right? Your statement is false for the random number
generators I use.

--
James Kanze GABI Software
Conseils en informatique orient?e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Carlos Moreno on
Ulrich Eckhardt wrote:

> Several things on aove code:
> - RAND_MAX not RAN_MAX
> - RAND_MAX+1 might overflow
> - rand()*100 might overflow, too
>
> The easiest way is to call rand() until it gives a number in your range,
> obviously, but this could mean many calls.
>
> A more sophisticated way is to first determine the largest multiple of your
> range that is smaller or equal to RAND_MAX, call rand() until the value is
> in that range and then simply apply the modulo operator to the result.

NO!!!!

Do *NOT* use modulo for this sort of things -- in the case you
describe, integer division works perfectly fine, and gives you
a perfectly uniform distribution (i.e., it introduces no
"disuniformities" in the distribution, with respect to the
base RNG -- if rand() is perfectly uniform, then drawing
numbers until you get one below the highest multiple of your
range, then dividing by the right number would give you a
perfectly uniform distribution in the given range)

>
> The reason is easy:
> Assume RAND_MAX=9, i.e. rand() gives you numbers from 0 to 9. If you need
> numbers in the range 0 to 3, the following mapping applies:
> 0 -> 0
> 1 -> 1
> 2 -> 2
> 3 -> 3
> 4 -> 0
> 5 -> 1
> 6 -> 2
> 7 -> 3
> 8 -> reroll
> 9 -> reroll

Good, except that it should be:

0 -> 0
1 -> 0
2 -> 1
3 -> 1
4 -> 2
5 -> 2
6 -> 3
7 -> 3
8 -> reroll
9 -> reroll

HTH,

Carlos
--

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Pete Becker on
kevin.hall(a)motioneng.com wrote:

> Modulo is a poor operation in helping find a random number. The reason
> is that the lowest bits in a psuedo-random number are less random than
> the upper bits.

That's a claim that's been made about certain types of random number
generators. It's not universally true, nor even as broadly true as many
people seem to believe.


> It's much better to use division:
>
> In the example above, the range [0,3] contains 4 numbers. [0,9]
> contains 10. 10/4 = 2 (integer division). 4*2 = 8 --> valid range for
> not rerolling must have 8 numbers: [0,7]. This is the same as above.

No, it addresses a different, real, problem. When you try to put ten
things into four slots, some slots will get more things than others.
Translation: if the size of your target range doesn't exactly divide the
size of your generator's range, you're going to get a less than perfect
distribution, regardless of the quality of your generator.

[good example snipped]

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]