From: kevin.hall on 9 Feb 2006 04:55 Modulo is a poor operation in helping find a random number. The reason is that the lowest bits in a psuedo-random number are less random than the upper bits. It's much better to use division: In the example above, the range [0,3] contains 4 numbers. [0,9] contains 10. 10/4 = 2 (integer division). 4*2 = 8 --> valid range for not rerolling must have 8 numbers: [0,7]. This is the same as above. But this time, rather than using modulo, integer divide by 2 (the result of 10/4): 0 -> 0 1 -> 0 2 -> 1 3 -> 1 4 -> 2 5 -> 2 6 -> 3 7 -> 3 8 -> reroll 9 -> reroll [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Maxim Yegorushkin on 9 Feb 2006 09:44 kevin.hall(a)motioneng.com wrote: > Modulo is a poor operation in helping find a random number. The reason > is that the lowest bits in a psuedo-random number are less random than > the upper bits. Are you talking about a specific implementation or does this behavior apply to any implementation of rand()? [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: kanze on 9 Feb 2006 09:43 kevin.hall(a)motioneng.com wrote: > Modulo is a poor operation in helping find a random number. > The reason is that the lowest bits in a psuedo-random number > are less random than the upper bits. If that's the case, then it isn't a very good random number generator, right? Your statement is false for the random number generators I use. -- James Kanze GABI Software Conseils en informatique orient?e objet/ Beratung in objektorientierter Datenverarbeitung 9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Carlos Moreno on 9 Feb 2006 09:53 Ulrich Eckhardt wrote: > Several things on aove code: > - RAND_MAX not RAN_MAX > - RAND_MAX+1 might overflow > - rand()*100 might overflow, too > > The easiest way is to call rand() until it gives a number in your range, > obviously, but this could mean many calls. > > A more sophisticated way is to first determine the largest multiple of your > range that is smaller or equal to RAND_MAX, call rand() until the value is > in that range and then simply apply the modulo operator to the result. NO!!!! Do *NOT* use modulo for this sort of things -- in the case you describe, integer division works perfectly fine, and gives you a perfectly uniform distribution (i.e., it introduces no "disuniformities" in the distribution, with respect to the base RNG -- if rand() is perfectly uniform, then drawing numbers until you get one below the highest multiple of your range, then dividing by the right number would give you a perfectly uniform distribution in the given range) > > The reason is easy: > Assume RAND_MAX=9, i.e. rand() gives you numbers from 0 to 9. If you need > numbers in the range 0 to 3, the following mapping applies: > 0 -> 0 > 1 -> 1 > 2 -> 2 > 3 -> 3 > 4 -> 0 > 5 -> 1 > 6 -> 2 > 7 -> 3 > 8 -> reroll > 9 -> reroll Good, except that it should be: 0 -> 0 1 -> 0 2 -> 1 3 -> 1 4 -> 2 5 -> 2 6 -> 3 7 -> 3 8 -> reroll 9 -> reroll HTH, Carlos -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Pete Becker on 9 Feb 2006 09:53
kevin.hall(a)motioneng.com wrote: > Modulo is a poor operation in helping find a random number. The reason > is that the lowest bits in a psuedo-random number are less random than > the upper bits. That's a claim that's been made about certain types of random number generators. It's not universally true, nor even as broadly true as many people seem to believe. > It's much better to use division: > > In the example above, the range [0,3] contains 4 numbers. [0,9] > contains 10. 10/4 = 2 (integer division). 4*2 = 8 --> valid range for > not rerolling must have 8 numbers: [0,7]. This is the same as above. No, it addresses a different, real, problem. When you try to put ten things into four slots, some slots will get more things than others. Translation: if the size of your target range doesn't exactly divide the size of your generator's range, you're going to get a less than perfect distribution, regardless of the quality of your generator. [good example snipped] -- Pete Becker Dinkumware, Ltd. (http://www.dinkumware.com) [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |