randsample bug? [Matlab]

Prev: Matlab 2010a install error on Kubuntu 10.04 Lucid
Next: hybrid simulation of signaling pathways..

From: Daniel on 14 Jun 2010 15:25

Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <h2t79j$1s$2(a)fred.mathworks.com>...
> MZ wrote:
>
> > > Peter, my weighting function is fairly long. 2001 elements. I can send
> > > it to you (as a .mat file?) if you give me an email address.
>
> Yes, I would be interested to see that, to confirm my suspicions. A mat file would be best, if possible. My address is as in the msg header, with obvious modification.
>
> > > I'm normalizing the cumsum(p) element. It shouldn't NEED normalization
> > > since p was normalized. But for some reason it does. Now it works
> > > every time.
> > >
> > > Also, if it helps, I'm using single not double precision for everything.
>
> Well, that's accumulated round-off error. Single precision makes it a whole lot easier to run into. What I'm confused about is that you said
>
> > > In fact, when I ask it if max(cumsum(p))>1, it returns true.
>
> and that would not seem to cause the problem you saw (perhaps you meant "<"?). In any case, thanks for pointing this out, and thanks in advance for sending those weights.
>
> - Peter Perkins
> The MathWorks, Inc.
>

I have come across this problem as well in R2009b. I have replicated the problem as follows:

>> w=[rand(1,20).*(10.^(-1:-1:-20))];
>> w'

ans =

5.107718317110735e-02
9.035816954336645e-03
5.019622603074214e-05
5.082034411944370e-06
6.672176771896132e-06
3.056483301112950e-07
2.847443517874017e-08
4.330449395779038e-09
2.409516966699802e-10
7.536455578748046e-11
3.252848495157920e-12
3.836686939640874e-13
3.099073689710823e-14
7.433976002934081e-15
4.534970427278192e-16
1.603071891483976e-17
2.871715469932097e-18
8.429440789615370e-19
7.805564204060190e-20
6.260292404284273e-21

>> p = w/sum(w);
>> e = [0 cumsum(p)];
>> e(end) = 1;
>> diff(e(end-5:end))'

ans =

2.220446049250313e-16
0
0
0
-2.220446049250313e-16

>> max(cumsum(p(1:end-1)))>1

ans =

1

The last line shows that by making the last element in "e" equal to 1 (as is done in randsample.m for the edges) creates a vector that is not "monotonically non-decreasing" as required by histc:

>> randsample(20,20,true,w);

??? Error using ==> histc
Edge vector must be monotonically non-decreasing.

Error in ==> randsample at 101
[~, y] = histc(rand(k,1),edges);

101 [~, y] = histc(rand(k,1),edges);

By setting edges(end) = 1, it seems that there is an assumption that the round-off accumulation occurs in the the last element of the weight vector (and not earlier in the vector as in the example above).

--Danny Dunlavy

From: Peter Perkins on 14 Jun 2010 16:15

On 6/14/2010 3:25 PM, Daniel wrote:

> I have come across this problem as well in R2009b. I have replicated the
> problem as follows:

Daniel, you are right. It has been fixed in R2010a, though for some
reason there was not a published bug report up on the web. I will try
to get one posted as soon as possible. A work-around for your case
seems to be to compute

p = w/sum(w);
e = min([0 cumsum(p)],1);
e(end) = 1;
p = diff(e);

from the original weights, and pass that p into randsample. You can
also modify RANDSAMPLE itself if you have permissions for your
installation -- the mod is just the second line above. Hope this helps.

|
Pages: 1
Prev: Matlab 2010a install error on Kubuntu 10.04 Lucid
Next: hybrid simulation of signaling pathways..