From: Ron Shepard on 13 Feb 2010 23:28

In article <1jdujsc.utdmji1u3elbsN%nospam(a)see.signature>,
nospam(a)see.signature (Richard Maine) wrote:

> steve <kargls(a)comcast.net> wrote:
>
> > On Feb 13, 2:32 am, tho...(a)antispam.ham wrote:
> > > cubsfan <cubsfan...(a)gmail.com> writes:
> > > > If so, then simply multiply by 4E16 and then subtract 2E16.
> >
> > The subtraction of two large numbers can lead to catastrophic
> > cancellation, which may lead to some undesired bias in the
> > final sequence. It might be better to multiply the result
> > of random_number by 2 and then subtract 1, and then multiply
> > by 2e16.
>
> The subtraction of two small numbers can lead to the same kind of
> cancellation. Cancellation happens when you subtract two nearly equal
> numbers of any magnitude. It is independent of scale (at least with
> floating point, which these numbers are). The above-proposed change
> should make no difference in terms of cancellation.
>
> I would tend to do the subtraction before the scaling, but for a reason
> having nothing to do with cancellation. I would do it just because it
> keeps the scaling factor in one place in the equation instead of two.

After

   call random_number(x)
   y = 2.0 * x - 1.0

then y is in the range -1.0 <= y < 1.0. That is, the range is not
symmetric. Also, there are a lot of possible small values for x that all
map to y=-1.0 if the expression rounds to nearest. It has never been
clear to me if PRNGs are supposed to generate these small numbers or not.

For most applications I use random numbers for, this doesn't matter, so
I don't need to separate out the special cases. However, I've always
thought that things would have been much simpler if random numbers were
generated in the range 0.5 <= x < 1.0 so that they would all have the
same binary exponent. This does not eliminate problems associated with
subtracting two almost equal numbers (that is an intrinsic issue with
floating point), but it would eliminate the problems associated with the
large exponent range.

Are there well-known solutions to these problems? If you want to
generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
distribution, and with all possible floating point values in that range
being generated, what is the best way?

$.02 -Ron Shepard
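[As a concrete sketch of the fixed-exponent idea in Ron's post (my own
illustration, not code from the thread): fold the standard 0.0 <= x < 1.0
output into 0.5 <= x < 1.0, where every value shares one binary exponent.
The rejection guard is an assumption about round-to-nearest behavior,
since the sum for the largest possible x can round up to exactly 1.0.]

   program fixed_exponent_range
      implicit none
      real :: x, x2
      do
         call random_number(x)   ! 0.0 <= x < 1.0
         x2 = 0.5 + 0.5*x        ! nominally 0.5 <= x2 < 1.0
         if (x2 < 1.0) exit      ! guard: largest x can round up to 1.0
      end do
      print *, x, x2, exponent(x2)   ! exponent(x2) is 0 for all outputs
   end program fixed_exponent_range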
From: William Clodius on 13 Feb 2010 23:52

Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:

> <snip>
> After
>
>    call random_number(x)
>    y = 2.0 * x - 1.0
>
> then y is in the range -1.0 <= y < 1.0. That is, the range is not
> symmetric. Also, there are a lot of possible small values for x that all
> map to y=-1.0 if the expression rounds to nearest. It has never been
> clear to me if PRNGs are supposed to generate these small numbers or not.
>
> For most applications I use random numbers for, this doesn't matter, so
> I don't need to separate out the special cases. However, I've always
> thought that things would have been much simpler if random numbers were
> generated in the range 0.5 <= x < 1.0 so that they would all have the
> same binary exponent. This does not eliminate problems associated with
> subtracting two almost equal numbers (that is an intrinsic issue with
> floating point), but it would eliminate the problems associated with the
> large exponent range.
>
> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?
>
> $.02 -Ron Shepard

Normally pseudorandom number generators initially generate a set of
integers that are uniformly distributed over a range 0...2**n-1. These
are then converted to a (near) uniform distribution over 0 <= x < 1 by
multiplying the integer by 2.**(-n) or its double precision equivalent.
This gives samples at the start of each interval. It can in many cases
be converted to a sample at the center of each interval by adding
2.**(-n-1) to each sample, provided you know the n of the integer
distribution. If the conversion is possible then the samples are over
0 < x < 1. Typically n is 16, 32 or 64, as they match natural integer
sizes for common processors, but sometimes other sizes are used.

The above assumes that 2**(-n) is larger than the relative precision of
the floating point type. It will not work for n=32 for a single
precision float or n=64 for a double precision float. In those cases
rounding will cause the samples intended to be near 1 to be exactly 1.
I have recently been thinking that an n=48 would be useful for double
precision random number generators for that reason. An n=48 can be
implemented by combining a high quality n=32 with a high quality n=16
that are uncorrelated with one another.

-- 
Bill Clodius
los the lost and net the pet to email
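[A self-contained Fortran sketch of the integer-to-real conversion Bill
describes, with the 2**(-n-1) centering shift for n=32. The xorshift32
step is only an assumed stand-in bit source so the example compiles on
its own; any uniform 32-bit integer generator would do in its place.]

   module centered_rng
      use iso_fortran_env, only: int32, int64, real64
      implicit none
      integer(int32) :: state = 123456789_int32   ! any nonzero seed
   contains
      function next_int32() result(r)
         integer(int32) :: r
         ! Marsaglia xorshift32 step, standing in for a real integer PRNG.
         state = ieor(state, ishft(state, 13))
         state = ieor(state, ishft(state, -17))
         state = ieor(state, ishft(state, 5))
         r = state
      end function next_int32

      function centered_uniform() result(x)
         real(real64) :: x
         integer(int64) :: u
         ! Reinterpret the 32 random bits as an unsigned value 0..2**32-1.
         u = iand(int(next_int32(), int64), 4294967295_int64)
         ! (u + 1/2) * 2**(-32) samples the center of each subinterval,
         ! so the result satisfies 0 < x < 1 with neither endpoint possible.
         x = (real(u, real64) + 0.5_real64) * 2.0_real64**(-32)
      end function centered_uniform
   end module centered_rng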
From: Louis Krupp on 14 Feb 2010 00:40

Ron Shepard wrote:
<snip>
> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?

To quote E P Chandler in another part of this thread, if your random
number generator returns values in [0, 1):

... draw a second random number, U2. If U2 < 0.5 then U1=-U1.

The only problem I can see is that zero will turn up twice as often as
it should.

I suppose you could do this, as you mentioned:

   call random_number(x)
   y = 2.0 * x - 1.0

and toss out y values of -1.

If, as you also mentioned, you want to restrict the random number range
to 0.5 <= y < 1.0, you could throw away values that are outside that
range.

Louis
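[A sketch of the second recipe Louis mentions: map to -1.0 <= y < 1.0
and toss out the -1.0 endpoint, which also absorbs the small x values
that round-to-nearest maps to exactly -1.0. My own illustration, not
code from the thread.]

   program symmetric_open_interval
      implicit none
      real :: x, y
      integer :: i
      do i = 1, 10
         do
            call random_number(x)   ! 0.0 <= x < 1.0
            y = 2.0*x - 1.0         ! -1.0 <= y < 1.0
            if (y > -1.0) exit      ! toss out y = -1.0
         end do
         print *, y                 ! now -1.0 < y < 1.0
      end do
   end program symmetric_open_interval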
From: glen herrmannsfeldt on 14 Feb 2010 02:33
Ron Shepard <ron-shepard(a)nospam.comcast.net> wrote:
(snip)

> call random_number(x)
> y = 2.0 * x - 1.0
>
> then y is in the range -1.0 <= y < 1.0. That is, the range is not
> symmetric. Also, there are a lot of possible small values for x that all
> map to y=-1.0 if the expression rounds to nearest. It has never been
> clear to me if PRNGs are supposed to generate these small numbers or not.

If it does generate them, those bits will be lost in the subtraction.
The ones I have seen in detail take a fixed number of bits, add an
exponent, and then normalize it. In that case, no extra bits will be
generated.

People like to mention the asymmetry in the range, but I am not
convinced that it matters. With 0.0 <= x < 1.0 the mean is 0.5 and the
standard deviation, if I figured it right, is 1/sqrt(12). The standard
deviation of the mean, then, is 1/sqrt(12)/sqrt(N) when averaging N
values. For a statistically significant deviation from the mean, the
bias has to be greater than 1/sqrt(12)/sqrt(N). I suppose if the RNG
only supplies 16 random bits then you could notice, but 24 bits in
single precision will require N to be more than 2**68. With 53 bits in
double precision N should be more than 2**110.

> For most applications I use random numbers for, this doesn't matter, so
> I don't need to separate out the special cases. However, I've always
> thought that things would have been much simpler if random numbers were
> generated in the range 0.5 <= x < 1.0 so that they would all have the
> same binary exponent. This does not eliminate problems associated with
> subtracting two almost equal numbers (that is an intrinsic issue with
> floating point), but it would eliminate the problems associated with the
> large exponent range.

The standard requires very little, other than a uniform distribution
with 0.0 <= x < 1.0.

> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?

The chance of those small numbers being generated is extremely small.
Generate more than 1024 random bits, count the number of leading zeros,
take the next 53 bits, and add the appropriate exponent. The
probability of a uniform binary number having N leading zeros is
2.**(-N). Not likely you will ever see those really small values.

-- glen
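[One reading of glen's count-the-leading-zeros recipe, as a Fortran 2008
sketch: leadz chooses the binary exponent geometrically and ibits fills
the 52 explicit fraction bits from fresh bits. The xorshift64 step is an
assumed stand-in bit source (it never produces a zero word, so with it
nz never exceeds 63); a real implementation would draw from a stronger
stream to reach the truly tiny values glen describes.]

   module full_range_rng
      use iso_fortran_env, only: int64, real64
      implicit none
      integer(int64) :: state = 88172645463325252_int64   ! nonzero seed
   contains
      function rand64() result(r)
         integer(int64) :: r
         ! Marsaglia xorshift64 step, standing in for a real bit source.
         state = ieor(state, ishft(state, 13))
         state = ieor(state, ishft(state, -7))
         state = ieor(state, ishft(state, 17))
         r = state
      end function rand64

      function full_range_uniform() result(x)
         real(real64) :: x
         integer(int64) :: bits
         integer :: nz
         ! Count leading zero bits of the stream; nz zeros put x in
         ! [2**(-nz-1), 2**(-nz)), which has probability 2**(-nz-1).
         nz = 0
         do
            bits = rand64()
            if (bits /= 0_int64) exit
            nz = nz + 64           ! a whole word of zeros; keep counting
         end do
         nz = nz + leadz(bits)
         ! Fill the 52 explicit fraction bits from fresh random bits.
         bits = ibits(rand64(), 0, 52)
         x = (1.0_real64 + real(bits, real64)*2.0_real64**(-52)) &
             * 2.0_real64**(-nz - 1)
      end function full_range_uniform
   end module full_range_rng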