From: kanze on 10 Feb 2006 09:33 kevin.hall(a)motioneng.com wrote: > What denotes a good random number generator? There are a number of tests. > No PSEUDO-random number generator is good compared to a TRUE > random number generator. It depends on what you are doing. For many applications, they are better. > But the C library rand() (which Ulrich Eckhardt and others in > this thread have recommended) is usually implemented as a > linear congruential PRNG. And these are notorious for having > less random lower order bits. The quality of a linear congruential PRNG depends on its parameters. Historically, the original Unix implementation used a very bad set, which resulted in a poor generator, period. (Not just the low order bits, although they were visibly poor.) Regretfully, this implementation ended up as the example in the C standard. But... This is a very poor generator, which fails just about every standard test. Avoiding using the low order bits may mask the problem, but the problem is still there. If your library implementation still uses this algorithm, the only real solution is to change it. The problem has been known for quite some time. I think (or hope) that most library implementations have since corrected it. > What PRNG do you use? Basically a linear congruential generator, based on parameters discussed in "Random Number Generators: Good Ones Are Hard To Find" (Park and Miller, CACM, Oct. 1988). To increase the period, I use two of them, with slightly different periods, merged using a shuffling algorithm described in Seciont 7.1 of <i>Numerical Recipes in C</i> (Press, Teukolscy, Vetterling and Flannery, 1992). > I use the mersenne-twister for anything that really matters in > my work. I've heard that they are very good. I've not had any reason to complain about my current implementation, but of course, I don't use it for anything really serious. > Though I believe the MT is less susceptible to having lower > order bits be less random, I haven't yet seen a study that > proves that. I believe that it passes all of the standard tests, which should be enough. > I would like to though -- seriously I would. If you can point > me to one or to a study that proves that the lower order bits > for the PRNGs that you use are not less random, please post a > link. The behavior of linear congruential generators is discussed in detail in the Park and Miller article cited above. For the Mersenne twister, I would look up the papers where it is first presented -- given its date, I presume that they document sufficient test results to convince people that they are good. FWIW, I usually use a very simple first test: I simulate throwing a pair of dice, using "rand()%6 + rand()%6 + 2". The relative probabilities for the results from 2 to 12 should be 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1. Do ten or twenty trials of 1000 throws each. If any one number is systematically above or below the expected probability, it's suspicious. (If all numbers are always exactly the expected probability, it's also suspicious.) With the orignal Unix implementation of rand(), the probability of 2, 4, 6, 8, 10 or 12 in this test was 0 -- the values never came up. I've not done extensive tests of the rand() under Solaris today, but at least this simple test doesn't show up any obvious errors. -- James Kanze GABI Software Conseils en informatique orient?e objet/ Beratung in objektorientierter Datenverarbeitung 9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: kevin.hall on 10 Feb 2006 11:53 > Which C library is "the C library"? Ok, you're right here. What I should have said to be more clear is "Many implementations of the C library use 'simple' PNRGs that are notorious for this behavior. In particular, many older libraries use linear congruential generators which are known to display this behavior". > Maybe. I'm defending thinking instead of basing engineering decisions > on rumors. Look, linear congruential generators are known to have this problem. Modulo is thus known *in some circumstances* to be a poor choice. Division is known not to have problems *in all circumstances*. (This is what Carlos has pointed out too). I'm going to choose the operation that is will work all the time. Then I don't have to worry as much about how good the PNRG is for lower order bits. > No, because it's not the same as above. OK, again I didn't make myself clear. There's two parts of the problem: (1) reducing the range of values from the PNRG to reduce the imbalance in results. (2) Choosing an operation to map the many "passing" values of the PNRG to the target range required by the user. I agreed with Ulrich on (1). I was attempting to show a little more of the math behind it. What I disagree with Ulrich on is (2). Thus my list shows a different transformation. I hope that clears things up. [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: kanze on 10 Feb 2006 11:52 Francis Glassborow wrote: > In article <1139433651.899829.311980(a)g47g2000cwa.googlegroups.com>, > kevin.hall(a)motioneng.com writes > >Modulo is a poor operation in helping find a random number. > >The reason is that the lowest bits in a psuedo-random number > >are less random than the upper bits. It's much better to use > >division: > And that is completely irrelevant even if it were true (which > it generally isn't) unless you are dividing by a power of two. Not really. It is a known fact that a certain category of linear congruent generators have a very low period in the low order bits: the bit 0 has a period of 2, the bit 1 of 4, etc. And that the example generator in the C standard is one of these. This can be important any time you're taking a modulo which is a multiple of 2. If you use an expression like rand()%6 + rand()%6 + 2 to throw dice, you will never roll an even value. Of course, such generators have a lot of other problems as well. A bad RGN is a bad RGN. Using the high order bits will typically mean that a naive user won't seen the pattern as readily, but it doesn't mean the results are any better. The answer is to use a better generator. I think that most implementations have fixed this; at any rate, the Solaris generator doesn't suffer from this problem, and I seem to remember than the Glibc generator used with Linux doesn't either. Still, if you have an application where you must have a certain quality generator, you have to provide your own; the C++ standard doesn't give any guarantee. -- James Kanze GABI Software Conseils en informatique orient?e objet/ Beratung in objektorientierter Datenverarbeitung 9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: kanze on 10 Feb 2006 11:54 Ulrich Eckhardt wrote: > Carlos Moreno wrote: > [using modulo versus using integer division to cut down an RNG's range] > > As pointed out already, the lower bits, for certain classes > > of generators and certain implementations tend to have lower > > "randomness" than the higher bits. > [snipped further explanation] > > So, in some cases modulo is worse, and in no case division > > is worse -- why use modulo then? > Okay, now that's a reasoning I accept! > However, I'd like to point out something: you implemented a > workaround but omitted to document what exactly it was you > were working around. He also omitted any explination or proof that his work-around actually worked around the problem. In fact, he mistated the problem: Historically, certain implementations of rand() were not particularly random. This was immediately and visually apparent in the low order bits, which had a very short cycle. Although the high order bits were not really any better, statistically speaking, the rumor soon got around that they were, somehow, "more random". In fact, even the best linear congruent generators suffer from one defect: a very low number will always be followed by a high number, since if the number is low enough, the multiplication will not wrap. Thus, for example, with the minimum quality random generator described in Park and Miller ("Random Number Generators: Good Ones Are Hard To Find", CACM, Oct. 1988), any time the generator returns a value less than 44488, you are guaranteed that the next value is larger. Since it does this only once in 48271 times, the statistical bias will not be immediately visible, but in fact, you can say that the high order bits are "slightly" less random than the low order bits. (My own implementation merges two good linear congruent generators with slightly different periods, so this effect is not visible. But somehow I doubt that the typical implementation of rand() goes that far.) In practice, I would expect that this is true of most modern implementations of rand(). The weaknesses of the older generators are obvious, and Park and Miller gave a simple and effective solution. -- James Kanze GABI Software Conseils en informatique orient?e objet/ Beratung in objektorientierter Datenverarbeitung 9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Pete Becker on 10 Feb 2006 11:54
Alberto Ganesh Barbati wrote: > quickcur(a)yahoo.com ha scritto: > >>Hi, I am using the int version of rand() with MS Visual Studio. I would >>like to generate random number in the range of [0, 99]. Right now I am >>using >> >>rand() * 100 / (RAN_MAX + 1) >> >>I found that I got too much 0. 0 appears more than any other numbers. >>How can I fix it? > > > I've seen a lot of discussion about rand(), modulo and division in this > thread... I'm not going to delve into that, but rather I suggest you > throw away rand() as soon as possible and consider using Boost.Random > (http://boost.org/libs/random/) as a replacement. Replacing something simple with something complex might be a good solution if the replacement solves a real problem. At this point there's nothing that indicates that such a drastic change is needed. rand() is part of the standard library, so it's already there. No extra work needed. > It's a solid > peer-reviewed random generator library, which means all gory details > have probably been settled, so that you don't need to bother about them. > > Boost.Random has been considered as a reference implementation for the > upcoming TR1, so I bet it's going to become quite popular (at least I > hope so). There is no reference implementation for TR1, nor was there any consideration of such a thing. -- Pete Becker Dinkumware, Ltd. (http://www.dinkumware.com) [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |