From: Ilmari Karonen on 31 Jul 2010 10:25

["Followup-To:" header set to sci.math.]

On 2010-07-30, Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:
> In article <i2tvn7$ldf$1(a)smaug.linux.pwf.cam.ac.uk>, nmm1(a)cam.ac.uk wrote:
>> In article <ron-shepard-969C9C.22130629072010(a)news60.forteinc.com>,
>> Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:
>>>
>>> The situation that needs to be avoided is to run the code once with one
>>> seed, and then run it again with another seed that results in an overlap
>>> of the sequences of values for the two runs. In some applications this
>>> is unimportant, but in other applications it would be bad for two
>>> supposedly independent runs to have a long sequence of identical values.
>>
>> No, no, a thousand times, NO! That is NOT enough, though FAR too many
>> Web pages, published papers and books claim that it is. Disjointness
>> isn't even a poor relation of randomness.
> [snip]
>> Parallel random number generation is not easy, and 99% of the stuff
>> published on it is somewhere between grossly misleading and complete
>> nonsense.
>
> I think in the parallel case, one would want to be able to generate
> a seed to produce values that are guaranteed not to overlap with any
> other node. Maybe something like
>
>    call RANDOM_NEW_SEED( old_seed, n, new_seed, my_node )
>
> would be a sufficient interface. new_seed(:) would depend on my_node
> in such a way that the generated sequence would not overlap with that
> produced by any other possible value of my_node (again, assuming the
> cycle length is long enough to satisfy that request).

Let me try to demonstrate, using a simplified (and admittedly somewhat
artificial) counterexample, why lack of overlap is not a sufficient
condition for independence.

Assume we have a "random oracle" R: Z -> [0,1) which takes as its input
an integer and returns a uniformly distributed random real number
between 0 and 1, such that the same input always produces the same
output, but the outputs for different inputs are completely independent.

Given such an oracle, we can construct a perfect PRNG P: Z -> [0,1)^N
which takes as its input an integer k and returns the sequence
<R(k), R(k+1), R(k+2), ...>. Obviously, the sequence generated by P
would be indistinguishable from random (since it is indeed, by
definition, random) and non-repeating.

Now, what if we wanted several independent streams of random numbers?
Obviously, we can't just pass different seed values to P, since we know
that the streams would eventually overlap. We could solve the problem
by modifying P, e.g. to make it return the sequence
<R(f(k,0)), R(f(k,1)), R(f(k,2)), ...> instead, where f is an injective
function from Z^2 to Z. But what if we wanted to do it _without_
modifying P?

One _bad_ solution would be to define a new generator Q: Z -> [0,1)^N
as Q(k)_i = frac(P(0)_i + ak), where a is some irrational number and
frac: R -> [0,1) returns the part of a real number "after the decimal
point" (i.e. frac(x) = x - floor(x)).

Clearly, the sequences returned by Q for different seed values would
never overlap, and, individually, they would each still be perfectly
random. Yet, just as clearly, all those sequences would be related by a
simple linear relation that would be blatantly obvious if you, say,
plotted two of them against each other.

The _right_ solution, as I suggested above, would have been to redesign
the underlying generator so that the different streams would be not
just non-overlapping but actually statistically independent.
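To see the flaw numerically, here is a quick C sketch of the bad
construction (purely an illustration of mine: rand() stands in for the
ideal oracle R, and a = sqrt(2) is an arbitrary irrational constant).
Printing the streams for the "seeds" k = 1 and k = 2 side by side shows
that their difference mod 1 is the same constant, frac(a), at every
index:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    /* frac(x) = x - floor(x), the part of x "after the decimal point" */
    static double frac(double x) { return x - floor(x); }

    int main(void)
    {
        const double a = sqrt(2.0);   /* an arbitrary irrational constant */
        double p0[10];                /* P(0)_i: one fixed base stream */
        int i;

        /* rand() stands in for the ideal oracle R */
        srand(12345);
        for (i = 0; i < 10; i++)
            p0[i] = rand() / ((double) RAND_MAX + 1.0);

        /* Q(k)_i = frac(P(0)_i + a*k), for the two seeds k = 1 and k = 2 */
        for (i = 0; i < 10; i++) {
            double q1 = frac(p0[i] + a * 1.0);
            double q2 = frac(p0[i] + a * 2.0);
            /* q2 - q1 (mod 1) is the same constant, frac(a), at every i */
            printf("%8.5f  %8.5f   diff mod 1 = %.5f\n", q1, q2, frac(q2 - q1));
        }
        return 0;
    }

Each column on its own would pass a uniformity test, yet knowing one
stream lets you predict the other exactly.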
The same general advice holds for practical PRNGs too, not just for my
idealized example. You can't just take an arbitrary PRNG, designed for
generating a _single_ stream of random-seeming numbers, and expect its
output to still look random when you get to compare several distinct
output streams generated from related seeds. It might turn out that it
does, if the designer of the PRNG was careful or just lucky. But
designing a PRNG to satisfy such a requirement is a strictly harder
problem than simply designing it to generate a single random-looking
stream.

--
Ilmari Karonen

To reply by e-mail, please replace ".invalid" with ".net" in address.
From: Uno on 31 Jul 2010 17:54

Ilmari Karonen wrote:
> ["Followup-To:" header set to sci.math.]
> On 2010-07-30, Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:
> [snip]
>
> Let me try to demonstrate, using a simplified (and admittedly somewhat
> artificial) counterexample, why lack of overlap is not a sufficient
> condition for independence.
>
> Assume we have a "random oracle" R: Z -> [0,1) which takes as its input
> an integer and returns a uniformly distributed random real number
> between 0 and 1, such that the same input always produces the same
> output, but the outputs for different inputs are completely independent.
>
> Given such an oracle, we can construct a perfect PRNG P: Z -> [0,1)^N
> which takes as its input an integer k and returns the sequence
> <R(k), R(k+1), R(k+2), ...>.
>
> [snip]
>
> We could solve the problem by modifying P, e.g. to make it return the
> sequence <R(f(k,0)), R(f(k,1)), R(f(k,2)), ...> instead, where f is an
> injective function from Z^2 to Z.
>
> [snip]
>
> But designing a PRNG to satisfy such a requirement is a strictly harder
> problem than simply designing it to generate a single random-looking
> stream.

Now that's an interesting read. Ron is a person who makes "mistakes" in
contemporary scientific programming, but he always moves the scrum
forward in a useful way.

I notice that follow-ups are set to sci.math, which I've never really
read much. When I have needed mathematical help on Usenet, I've turned
to the Germans in de.sci.mathematik, who have a very well-stylized
notation.

I'm used to seeing the notation of functions from the point of view of
abstract algebra (in particular the Gallian text, which follows me
everywhere). But there's been a lot of water under the bridge (and one
concussion) since I had the leisure of reading college textbooks. So my
question for you is going to be more of a request, namely, that you say
a few more words about the mappings, what sets are involved, and what
things like Z^2 mean exactly.

To reveal my naivete, I would think a perfect PRNG would look like:

   P: Z^N -> [0,1)^N

By the way, nice job getting the interval [0,1) correct. That's a
Fortranism.

Cheers,
--
Uno
From: sturlamolden on 31 Jul 2010 20:31

On 31 Jul, 16:25, Ilmari Karonen <usen...(a)vyznev.invalid> wrote:
> The _right_ solution, as I suggested above, would have been to redesign
> the underlying generator so that the different streams would be not
> just non-overlapping but actually statistically independent.

Yes. For the Mersenne Twister it has been suggested to use different
characteristic polynomials that are "relatively prime" to each other.
Another suggestion is to derive a fast "jump ahead" algorithm, allowing
parallel work-sharing with the same polynomial. Just picking different
seeds and praying they will not overlap/interact is risky.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/jump-seta-lfsr.pdf

The other option is to observe that random numbers can usually be
produced much faster than they are consumed. That is because the simple
integer manipulation in the PRNG is much less expensive than the
complicated floating-point maths in numerical algorithms. If we can
only consume a fraction of the 138 million random numbers per second
from KISS4691, we don't need a parallel PRNG.

We can let one processor fill N spinlock-synchronized queues (FIFOs)
with pseudorandom integers from a single KISS4691 sequence. The queues
then serve as sources of random numbers for N effectively independent
consumers. The synchronization overhead is amortized by buffering
inside the PRNG front-end that harvests a queue. E.g. with an internal
buffer of 256 integers, one only has to acquire the spinlock and
harvest the queue once every 256 calls to the PRNG. The best buffer
size depends on the hardware (e.g. the memory hierarchy).

Here we could note that KISS ("keep it simple stupid") is fast compared
to the Mersenne Twister, which makes it more suitable for this kind of
parallel scheme.
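A rough C11 sketch of what such a scheme could look like (the names
fifo_t, next_random, BATCH etc. are my own, a xorshift generator stands
in for KISS4691, and a real implementation would tune FIFO_SIZE/BATCH
and back off instead of busy-spinning):

    /* build with e.g.:  cc -O2 -pthread queues.c */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdatomic.h>
    #include <pthread.h>

    #define FIFO_SIZE 4096        /* must be a power of two */
    #define BATCH      256        /* integers harvested per lock acquisition */

    typedef struct {
        uint32_t    buf[FIFO_SIZE];
        unsigned    head, tail;   /* tail - head = number of queued items */
        atomic_flag lock;
    } fifo_t;

    static void fifo_lock(fifo_t *q)   { while (atomic_flag_test_and_set(&q->lock)) ; }
    static void fifo_unlock(fifo_t *q) { atomic_flag_clear(&q->lock); }

    static fifo_t queue = { .head = 0, .tail = 0, .lock = ATOMIC_FLAG_INIT };

    /* stand-in for KISS4691: a 32-bit xorshift generator */
    static uint32_t xorshift32(uint32_t *s)
    {
        uint32_t x = *s;
        x ^= x << 13;  x ^= x >> 17;  x ^= x << 5;
        return *s = x;
    }

    /* producer: keeps the FIFO topped up from a single PRNG stream
       (a real version would back off or sleep when the FIFO is full) */
    static void *producer(void *arg)
    {
        uint32_t state = 2463534242u;
        (void) arg;
        for (;;) {
            fifo_lock(&queue);
            while (queue.tail - queue.head < FIFO_SIZE)
                queue.buf[queue.tail++ & (FIFO_SIZE - 1)] = xorshift32(&state);
            fifo_unlock(&queue);
        }
        return NULL;
    }

    /* consumer front-end: refills a private buffer of BATCH integers,
       so the spinlock is taken only once per BATCH calls */
    typedef struct { uint32_t buf[BATCH]; int n; } frontend_t;

    static uint32_t next_random(frontend_t *f)
    {
        if (f->n == 0) {
            fifo_lock(&queue);
            while (queue.tail - queue.head < BATCH) {  /* wait for the producer */
                fifo_unlock(&queue);
                fifo_lock(&queue);
            }
            for (int i = 0; i < BATCH; i++)
                f->buf[i] = queue.buf[queue.head++ & (FIFO_SIZE - 1)];
            fifo_unlock(&queue);
            f->n = BATCH;
        }
        return f->buf[--f->n];
    }

    int main(void)
    {
        pthread_t tid;
        frontend_t fe = { .n = 0 };
        pthread_create(&tid, NULL, producer, NULL);
        for (int i = 0; i < 5; i++)
            printf("%u\n", next_random(&fe));
        return 0;   /* exiting main also ends the producer thread */
    }

The point is simply that the spinlock is touched once per BATCH
numbers, so the synchronization cost per random number becomes
negligible.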
From: Ron Shepard on 1 Aug 2010 21:19

In article
<e5f36412-ddf2-4ed5-9014-9d23e2c0da88(a)u4g2000prn.googlegroups.com>,
orz <cdhorz(a)gmail.com> wrote:

> If an RNG has a large number of possible states (say, 2^192) then it's
> unlikely for any two runs to EVER reach identical states by any means
> other than starting from the same seed. This RNG has a period vastly
> in excess of "large", so for practical purposes it just won't happen.

If you think about it for a second, you can see that this is not true
for a PRNG with the standard Fortran interface. This interface allows
the seed values to be both input and output. The purpose of this
(presumably, although it is not required by the standard) is to allow
the seed to be extracted at the end of one run and then used to
initialize the seed for the next run. This eliminates the possibility
of the two runs having overlapping sequences (provided the PRNG cycle
is long enough, of course).

But the ability to read the internal state of the PRNG also allows the
user to generate the exact same sequence twice (useful for debugging,
for example). Or the user could generate two sequences that have
arbitrary numbers of overlapping values (the sequences offset by 1,
offset by 2, offset by 100, or whatever). I can't think of any
practical reason for doing this. However, since it can be done with
100% certainty, it is not "unlikely" for it to occur, assuming the user
wants it to happen.

In a previous post, I suggested a new (intrinsic Fortran) subroutine of
the form

   call RANDOM_NEW_SEED( old_seed, n, new_seed )

One way to implement this subroutine would be for new_seed(:) to be the
internal state corresponding to the (n+1)-th element of the sequence
that results (or would result) from initialization with old_seed(:).
Of course, for this to be practical, especially for large values of n,
the PRNG algorithm would have to allow the internal state for arbitrary
elements within a sequence to be determined in this way. That places
practical restrictions on which PRNG algorithms can be used with this
interface, especially if additional features such as independence of
the separate sequences are also required.

$.02 -Ron Shepard
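To illustrate the kind of jump-ahead such an interface implies, here is
a C sketch for a simple 64-bit LCG (a toy example of mine, not the
algorithm behind any compiler's RANDOM_SEED; the constants are Knuth's
MMIX multiplier and increment). The state n steps ahead of old_seed is
computed in O(log n) operations by repeated squaring of the affine map
x -> a*x + c:

    #include <stdint.h>
    #include <stdio.h>

    #define LCG_A 6364136223846793005ULL   /* Knuth's MMIX multiplier */
    #define LCG_C 1442695040888963407ULL   /* and increment */

    static uint64_t lcg_next(uint64_t *state)
    {
        return *state = LCG_A * *state + LCG_C;   /* mod 2^64 is implicit */
    }

    /* the state n steps ahead of old_seed, in O(log n) operations */
    static uint64_t lcg_jump(uint64_t old_seed, uint64_t n)
    {
        uint64_t acc_mult = 1, acc_plus = 0;      /* identity map x -> x */
        uint64_t cur_mult = LCG_A, cur_plus = LCG_C;
        while (n > 0) {
            if (n & 1) {                          /* compose acc with cur */
                acc_plus = acc_plus * cur_mult + cur_plus;
                acc_mult = acc_mult * cur_mult;
            }
            cur_plus = (cur_mult + 1) * cur_plus; /* square the affine map */
            cur_mult = cur_mult * cur_mult;
            n >>= 1;
        }
        return acc_mult * old_seed + acc_plus;
    }

    int main(void)
    {
        uint64_t s = 12345, n = 1000000;
        uint64_t jumped = lcg_jump(s, n);
        for (uint64_t i = 0; i < n; i++)          /* step n times the slow way */
            lcg_next(&s);
        printf("stepped: %llu\njumped:  %llu\n",
               (unsigned long long) s, (unsigned long long) jumped);
        return 0;
    }

Generators like MT19937 admit analogous jump-ahead procedures (see the
papers cited earlier in the thread), but they are considerably more
involved than the LCG case.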
From: sturlamolden on 1 Aug 2010 21:29
I did a speed comparison on my laptop of KISS4691 against Mersenne
Twister 19937. KISS4691 produced about 110 million random numbers per
second, whereas MT19937 produced 118 million. If the compiler was
allowed to inline KISS4691, the performance increased to 148 million
per second. The function call overhead is important: unless it is
eliminated, MT19937 is actually faster than KISS4691.

Also note that a SIMD-oriented version of MT19937 is twice as fast as
the one used in my tests, and there is also a version of the Mersenne
Twister suitable for parallel processing and GPUs. So a parallel or
SIMD version of the Mersenne Twister is likely to be a faster PRNG than
KISS4691.

Then for the question of numerical quality: is KISS4691 better than
MT19937? I don't feel qualified to answer this. Marsaglia is the
world's foremost authority on random number generators, so I trust his
advice, but MT19937 is claimed to give impeccable quality for
everything except cryptography.

For those who need details: the speed test was basically running the
PRNGs for five seconds or more, querying Windows' performance counter
every ten million calls, and finally subtracting an estimate of the
timing overhead. The C compiler was Microsoft C/C++ version
15.00.30729.01 (the only optimization flag I used was /O2, i.e.
maximize speed). The processor is an AMD Phenom N930 @ 2.0 GHz.

Sturla Molden
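For reference, a portable approximation of that kind of measurement
(just a sketch of mine: it uses the standard C clock() instead of
Windows' performance counter, a trivial placeholder generator, and it
omits the timing-overhead subtraction and the inlining variations):

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    static uint32_t prng_state = 123456789u;

    static uint32_t my_prng(void)   /* placeholder for the generator under test */
    {
        uint32_t x = prng_state;
        x ^= x << 13;  x ^= x >> 17;  x ^= x << 5;
        return prng_state = x;
    }

    int main(void)
    {
        const uint64_t block = 10000000;   /* calls between clock queries */
        uint64_t total = 0;
        volatile uint32_t sink = 0;        /* keeps the calls from being optimized away */
        clock_t t0 = clock(), t1;
        double  elapsed;

        do {
            for (uint64_t i = 0; i < block; i++)
                sink ^= my_prng();
            total += block;
            t1 = clock();
            elapsed = (double) (t1 - t0) / CLOCKS_PER_SEC;
        } while (elapsed < 5.0);           /* run for at least five seconds */

        printf("%.1f million numbers per second\n", total / elapsed / 1e6);
        return 0;
    }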