From: Bill Rowe on
On 6/25/10 at 7:27 AM, stone(a)geology.washington.edu (John Stone)
wrote:

>I am trying to use RandomReal[ ] to sample from bins of different
>widths that span the interval 0 - 1. The bin widths represent the
>weights I'm assigning to a family of trial solutions in an
>optimization problem. The aim is to sample the solutions in
>proportion to their weights using a uniform distribution of random
>numbers generated by RandomReal[ ].

>For a simple example, however, suppose there are 10 equally weighted
>solutions. My selection process would use some code that looks
>like:

>weights = Table[0.1, {10}];
>bins = Accumulate[weights];
>Select[bins, (# >= RandomReal[] &)][[1]]

Rather than RandomReal you should be using RandomChoice. Specifically,

RandomChoice[weights->bins,10]

will return a list of 10 values with the desired distribution.
This can be seen by doing:

Histogram[RandomChoice[weights -> bins, 1000]]

and note with equal weights and equally spaced bins of size 0.1,
the following is equivalent

RandomInteger[{1,10}]/10//N
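
Since the original question is really about bins of different widths, here is a rough sketch of the same idea with unequal weights. The five weight values below are made up purely for illustration (they are not from the original post), and the seed is fixed only so the run is repeatable. Selecting indices rather than bin edges makes it easy to map each pick back to a trial solution.

```mathematica
(* hypothetical unequal weights for five trial solutions *)
weights = {0.05, 0.15, 0.2, 0.25, 0.35};

SeedRandom[1]; (* fixed seed, only for repeatability *)

(* draw 100000 solution indices in proportion to their weights *)
picks = RandomChoice[weights -> Range[Length[weights]], 100000];

(* empirical frequency of each index, for comparison with weights *)
freqs = N[Table[Count[picks, i], {i, Length[weights]}]/Length[picks]]
```

The values in freqs should come out close to the weights themselves, which is the behavior the question was after.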

>Assuming the result of RandomReal[ ] is uniformly distributed, I
>expected this to return 0.1 as frequently as it returns 0.5 or 1,

No, this isn't correct. The value 0.1 will be returned whenever
RandomReal returns a value greater than or equal to 0.1 but less
than 0.2 which should happen 10% of the time. But the value 1
will be returned only if RandomReal returns the value 1 which
will happen with probability near 0.

Now consider what happens when RandomReal returns a value
greater than 0.1 but less than 0.3. This will occur ~20% of the
time. And your selection criterion will return 0.2 as the first
value in the list of selected values. That is, 0.1 occurs with
probability 10%, 0.2 occurs with probability 20% and 1 occurs
with very low probability (near 0).

So, it is clear the distribution with this selection criterion
cannot be flat as you were expecting.

I haven't worked out the probability for the other values in the
list. I think the above is sufficient to show the selection
criteria you have used will not return uniform deviates.


From: Bill Rowe on
On 6/26/10 at 3:09 AM, readnews(a)sbcglobal.net (Bill Rowe) wrote:

>On 6/25/10 at 7:27 AM, stone(a)geology.washington.edu (John Stone)
>wrote:
>
>>I am trying to use RandomReal[ ] to sample from bins of different
>>widths that span the interval 0 - 1. The bin widths represent the
>>weights I'm assigning to a family of trial solutions in an
>>optimization problem. The aim is to sample the solutions in
>>proportion to their weights using a uniform distribution of random
>>numbers generated by RandomReal[ ].

>>For a simple example, however, suppose there are 10 equally
>>weighted solutions. My selection process would use some code that
>>looks like:

>>weights = Table[0.1, {10}];
>>bins = Accumulate[weights];
>>Select[bins, (# >= RandomReal[] &)][[1]]

>Rather than RandomReal you should be using RandomChoice.
>Specifically,

>RandomChoice[weights->bins,10]

>will return a list of 10 values with the desired distribution. This
>can be seen by doing:

>Histogram[RandomChoice[weights -> bins, 1000]]

>and note with equal weights and equally spaced bins of size 0.1, the
>following is equivalent

>RandomInteger[{1,10}]/10//N

Up to this point my response was fine. RandomChoice is the thing
to use when you want random selection from a pre-defined list of
things with various weights.

But the explanation I gave for why the code didn't work as
expected is simply wrong. Peter Pain correctly pointed out
something I should have immediately realized. RandomReal
generates a new random value for each comparison made. And it is
this characteristic that causes the distribution to differ from
uniform. A simple demonstration that this is the case is to look
at the length of the lists returned that start with 0.1. That is:

In[12]:= Union[
Length /@
Cases[Table[Select[bins, (# >= RandomReal[] &)], {1000}],
{0.1, __}]]

Out[12]= {3,4,5,6,7,8,9}

If there were only one random value selected whenever the
selection was done, clearly the length of the lists with a given
starting value would be constant. The idea of using Select to
create the distribution can be made to work as follows:

With[{a = RandomReal[]}, Select[bins, (# >= a) &]][[1]]

Repeating the demonstration above using this code yields:

In[13]:= Union[
Length /@
Cases[Table[
With[{a = RandomReal[]},
Select[bins, (# >= a) &]], {1000}], {0.1, __}]]

Out[13]= {10}

showing every list returned that starts with the value 0.1
contains all ten values.
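
For the original code, where a fresh random value is drawn at every comparison, the distribution of the first selected value can be worked out directly. Bin k is returned first only if bins 1 through k-1 each fail their own independent comparison and bin k passes its own, so the probability is bins[[k]] times the product of (1 - bins[[j]]) for j < k. The sketch below, assuming the same weights and bins as in the question, computes those probabilities and compares them with a simulation. The names pFirst, sample and observed are just illustrative.

```mathematica
weights = Table[0.1, {10}];
bins = Accumulate[weights];

(* probability that bin k is the first to pass when every
   comparison uses its own independent uniform random number *)
pFirst = Table[
   bins[[k]] Product[1 - bins[[j]], {j, 1, k - 1}], {k, Length[bins]}];

SeedRandom[2]; (* fixed seed, only for repeatability *)

(* the original selection, with RandomReal[] re-evaluated per element *)
sample = Table[Select[bins, (# >= RandomReal[] &)][[1]], {100000}];

(* observed frequency of each bin edge as the first selected value *)
observed = Table[Count[sample, b], {b, bins}]/100000.;

{pFirst, observed}
```

The computed probabilities start at roughly 0.1, 0.18, 0.216, 0.2016 and then fall off quickly, with the value 1 returned only about 0.036% of the time. That is far from the flat 10% per bin that was expected.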

But while this corrects the issue, this code will execute slower
than code using RandomChoice will.
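
To get a feel for the difference, something like the following could be used. This is only a sketch: the sample size n is arbitrary, the names tSelect and tChoice are illustrative, and the actual timings will depend on the machine and the Mathematica version.

```mathematica
weights = Table[0.1, {10}];
bins = Accumulate[weights];
n = 100000; (* arbitrary sample size *)

(* corrected Select approach: one random number per draw *)
tSelect = First[AbsoluteTiming[
    Table[With[{a = RandomReal[]},
       Select[bins, (# >= a) &]][[1]], {n}];]];

(* RandomChoice generating all n draws in a single call *)
tChoice = First[AbsoluteTiming[
    RandomChoice[weights -> bins, n];]];

{tSelect, tChoice}
```

Both approaches should give the same distribution. The comparison only shows how much time the element-by-element Select costs relative to the single RandomChoice call.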