From: darthshak on 7 Feb 2010 06:13

I was testing out the result that the PDF of the sum of two RVs is the convolution of the individual PDFs. I decided to test it with Gaussian PDFs.

x = linspace(-10,10,10000);
f1 = normpdf(x,0,1);
f2 = normpdf(x,0,1);
f12 = conv(f1,f2);
f12 = f12(length(x)/2+1:length(f12)-length(x)/2+1); %Valid part

Understandably, at this point, the area under the PDF is not 1 because the convolution step size is 1. So,

f12 = f12*(max(x) - min(x))/length(x);

This is the answer I got for the area under the curve:

>> diff(fnval(fnint(csape(x,f12)),[min(x) max(x)]))

ans =

    0.9999

It all seemed fine until I ran this command:

>> y = normpdf(x,0,2);
>> sum((y - f12).^2)

ans =

    3.6937

This obviously shouldn't be happening; the answer I'd expect should normally be < 1e-10. Where have I gone wrong?

From: Matt J on 7 Feb 2010 20:32

darthshak <vishakadatta(a)gmail.com> wrote in message <735962518.127094.1265577264204.JavaMail.root(a)gallium.mathforum.org>...
> I was testing out the result that the PDF of the sum of two RVs is the convolution of the individual PDFs. I decided to test it with Gaussian PDFs.
>
> x = linspace(-10,10,10000);
> f1 = normpdf(x,0,1);
> f2 = normpdf(x,0,1);
> f12 = conv(f1,f2);
> f12 = f12(length(x)/2+1:length(f12)-length(x)/2+1); %Valid part
=====================

It's not clear what's invalid about the part of f12 you're throwing away. If this were a continuous convolution, you wouldn't do that.

> Understandably, at this point, the area under the PDF is not 1 because the convolution step size is 1. So,
>
> f12 = f12*(max(x) - min(x))/length(x);
===================

It's probably better to do

dx = diff(x(1:2));
f12 = f12/sum(f12)/dx;

> It all seemed fine until I ran this command:
>
> >> y = normpdf(x,0,2);
> >> sum((y - f12).^2)
>
> ans =
>
>     3.6937
>
> This obviously shouldn't be happening; the answer I'd expect should normally be < 1e-10.
==============

Probably because of the piece of f12 that you lopped off as invalid. Additionally, rather than expecting the unnormalized sum to be small, I would expect the integral approximation to be small:

sum((y - f12).^2)*dx
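
A minimal sketch of the test with these suggestions applied (keep the full conv output, scale by dx, compare via the integral approximation). normpdf is assumed to come from the Statistics Toolbox, and the wider grid x12 is purely illustrative. Note also that normpdf takes the standard deviation as its third argument, and the sum of two independent N(0,1) variables has standard deviation sqrt(2); with the reference left at normpdf(x,0,2), a large residual is expected no matter how the convolution itself is handled.

x  = linspace(-10,10,10000);
dx = x(2) - x(1);
f1 = normpdf(x,0,1);
f2 = normpdf(x,0,1);

f12 = conv(f1,f2)*dx;            % full output, scaled by dx to approximate
                                 % the continuous convolution integral
n12 = numel(f12);                % = 2*numel(x) - 1
x12 = 2*x(1) + (0:n12-1)*dx;     % grid on which the full output lives

y = normpdf(x12, 0, sqrt(2));    % sum of two N(0,1) RVs has variance 2

sum((y - f12).^2)*dx             % integral approximation of the squared error;
                                 % should come out close to zero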

From: darthshak on 7 Feb 2010 16:24

The reason I lopped off part of f12 was that it was created using zero-padding. I now see that it is perfectly valid, as the tails do indeed become 0. However, the problem is this: I need to perform 256 convolutions (yeah, you guessed right, this is a continuous convolution), and the problem with the PDF I will be using (the generalized normal distribution) is that it doesn't quite approach a Gaussian even with 256 convolutions.

If I don't lop off that portion of f12, f12 will grow to a huge array size by the end of 256 such convolutions. Plus, if the array sizes of f12 and f1 are different, I can't quite convolve the two now, can I?

From: Matt J on 8 Feb 2010 05:00

darthshak <vishakadatta(a)gmail.com> wrote in message <1792794453.128541.1265613929048.JavaMail.root(a)gallium.mathforum.org>...
> The reason I lopped off part of f12 was that it was created using zero-padding.
==================

I don't see what zero-padding you mean. None of the expressions in the code you gave zero-pad f1 and f2. The fact remains that when you convolve two functions, you should get a wider result, whose length is the sum of the lengths of the two functions.

> I now see that it is perfectly valid, as the tails do indeed become 0. However, the problem is this: I need to perform 256 convolutions (yeah, you guessed right, this is a continuous convolution), and the problem with the PDF I will be using (the generalized normal distribution) is that
=====================

You should probably be convolving using FFTs. In the Fourier domain, you will not have to perform 256 convolutions. You simply need to do

ifft(fft(f1).^256)

> it doesn't quite approach a Gaussian even with 256 convolutions.
============

Not really a surprise, since I'm still not convinced you should be truncating the convolution output.

> If I don't lop off that portion of f12, f12 will grow to a huge array size by the end of 256 such convolutions.
==========

In the example you gave, you were using 10000 samples. A 256-fold convolution will need an array that is only 20MB. The FFT of such an array is pretty fast:

>> x = rand(256*10000,1);
>> tic; fft(x); toc
Elapsed time is 0.575100 seconds.

Of course, if you want to get fancy, you can iterate in chunks of, say, 10 convolutions and then reduce your sampling interval. As the convolution results get wider and smoother, they require less fine sampling.

> Plus, if the array sizes of f12 and f1 are different, I can't quite convolve the two now, can I?
===================

Why not? Even if you were to insist on using conv(), there is no reason its arguments need to be the same length, as shown in the following:

>> f12 = [1 2 3], f1 = [1 2]

f12 =

     1     2     3

f1 =

     1     2

>> conv(f12,f1)

ans =

     1     4     7     6
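
To make the FFT route concrete, here is a rough sketch under the same setup as the first post (the names K, g, h, fK, xK are illustrative only, not from the thread). One practical caveat: with 10000 samples, fft(f1) has a DC term of roughly 1/dx, so raising it directly to the 256th power overflows double precision; working with the probability masses f1*dx instead keeps the Fourier-domain powers bounded.

K  = 256;
x  = linspace(-10,10,10000);
dx = x(2) - x(1);
f1 = normpdf(x,0,1);                 % stand-in for the actual starting PDF

N  = K*numel(f1) - (K-1);            % length of the K-fold linear convolution
g  = f1*dx;                          % discrete probability masses, sum(g) ~ 1
h  = real(ifft(fft(g,N).^K));        % pmf of the sum of K independent copies
                                     % (zero-padding to N avoids wrap-around)
fK = h/dx;                           % back to a density
xK = K*x(1) + (0:N-1)*dx;            % grid on which fK lives

sum(fK)*dx                           % sanity check: should be close to 1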
From: darthshak on 7 Feb 2010 21:37

What I mean by zero-padding is the extension scheme used by the conv function when the boundaries of f1 are breached during the convolution.

Also, my ultimate aim is to perform a convolution where the PDFs have different means, so doing ifft(fft(f1).^256) will be out of the question. I'm also unable to follow what you are trying to say; it would be best if you could use f1 and f2 from the example in my first post and demonstrate how you would perform the convolution.

Also, when I compare f12 to normpdf(x,0,2), don't I need both vectors to be of the same length?
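
For the different-means case, a hedged sketch of one way the FFT approach could still apply: take the product of the individual transforms rather than a power of one of them. f1 and f2 follow the first post except that f2 is given a nonzero mean here purely for illustration; evaluating the reference density on the wider output grid also makes the two vectors the same length for the comparison.

x  = linspace(-10,10,10000);
dx = x(2) - x(1);
f1 = normpdf(x,0,1);
f2 = normpdf(x,3,1);                  % different mean, for illustration

N   = numel(f1) + numel(f2) - 1;      % length of the full linear convolution
g12 = real(ifft(fft(f1*dx,N).*fft(f2*dx,N)));   % pmf of the sum
f12 = g12/dx;                         % density of the sum
x12 = 2*x(1) + (0:N-1)*dx;            % supports add under convolution

y = normpdf(x12, 0+3, sqrt(1+1));     % means add, variances add

sum((y - f12).^2)*dx                  % should be small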