From: Tomaz on
Hi all!

In Matlab I know the mvnrnd(MU,SIGMA) function. It gives me a random vector drawn from a multivariate normal distribution characterized by MU and SIGMA. Now I am looking for the simplest way to draw a value for only one of the attributes that make up the vector, given that I know the values of the remaining attributes. If I am not mistaken, this would be called conditional sampling?

To illustrate: I have 5 attributes (independent variables) altogether, and I fit a multivariate normal distribution to a dataset consisting of 999 data points. Next, I have a 1000th data point with the value for attribute no. 5 missing, but I do know the values of attributes 1-4. I would like to sample the value for the missing attribute based on the values of the other 4. I imagine that the expected value should be different for [-1 2 1 5 missing] than for [1000 3456 221 8901 missing]. Is there any simple way to achieve this, i.e. to take the values of the n-1 known attributes into account when sampling?
From: Peter Perkins on
On 4/12/2010 5:54 AM, Tomaz wrote:

> In Matlab I know the mvnrnd(MU,SIGMA) function. This gives me a random
> vector drawn from multivariate normal distribution characterized by MU
> and SIGMA. Now I am searching for the simplest way to get a value for
> only one of the attributes that make up the vector (given that I know
> the values of the rest attributes). If I am not mistaken this would be
> called conditional sampling?

Tomaz, I think this Wikipedia section is what you are looking for:

<http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions>
From: Tomaz on
Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <hpv9rm$n2d$1(a)fred.mathworks.com>...
> On 4/12/2010 5:54 AM, Tomaz wrote:
>
> > In Matlab I know the mvnrnd(MU,SIGMA) function. This gives me a random
> > vector drawn from multivariate normal distribution characterized by MU
> > and SIGMA. Now I am searching for the simplest way to get a value for
> > only one of the attributes that make up the vector (given that I know
> > the values of the rest attributes). If I am not mistaken this would be
> > called conditional sampling?
>
> Tomaz, I think this Wikipedia section is what you are looking for:
>
> <http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions>

Peter, thanks, but is this also useful when dealing with more than 2 independent variables?
And I guess that there is no 'straightforward' way of doing this in Matlab?
From: Peter Perkins on
On 4/12/2010 12:41 PM, Tomaz wrote:

> Peter thanks, but is this also useful when dealing with more than 2
> independent variables? And I guess that there is no 'straightforward'
> way of doing this in Matlab?

Look closer at those formulas, and the definitions above them: the formula is entirely general, and it is simple to implement in MATLAB. I'm guessing someone has already posted something like this to the MATLAB Central File Exchange, but I haven't checked.

If you have N variables, then "1" and "2" in the formula represent subsets of 1:N. That Wikipedia page happens to have things set up so that the conditioning variables are all at the end (i.e., "2" corresponds to (q+1):N) and the unobserved variables are all at the beginning (1:q), but that's just to make the notation simpler.
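
For reference, the partitioned formulas being discussed can be written out as below. This is a sketch in the column-vector convention of the Wikipedia page; the symbol a (the observed value of the conditioning block x_2) and the bar notation are taken from that convention, not from this thread.

```latex
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad
\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad
\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}

x_1 \mid x_2 = a \;\sim\; \mathcal{N}(\bar{\mu}, \bar{\Sigma}), \qquad
\bar{\mu} = \mu_1 + \Sigma_{12}\,\Sigma_{22}^{-1}\,(a - \mu_2), \qquad
\bar{\Sigma} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}
```

Nothing here requires x_1 or x_2 to be scalar or contiguous; they are just two index sets that together cover 1:N.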

Given a row vector mu and a cov matrix Sigma, define i2 as the coordinates that you are conditioning on, and i1 as everything else. Then let mu1 = mu(i1), Sigma11 = Sigma(i1,i1), etc., and apply those formulas. Two things:

1) You'll want to do something like

Sigma1_2 = Sigma11 - Sigma12*(Sigma22\Sigma21)

and similarly for mu, rather than explicitly using INV. Type "help slash".

2) You might have trouble because that Wikipedia page has the MVN in terms of column vectors. You'll want to use row vectors. And so, with a as the row vector of known values for the conditioning variables:

mu1_2 = mu1 + ((a-mu2)/Sigma22)*Sigma21
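
Putting the pieces together for the 5-attribute case from the original question, a minimal sketch might look like the following. It assumes the conditional mean takes the standard form mu1 + (a-mu2)*inv(Sigma22)*Sigma21 in row-vector notation. The data matrix X is synthetic stand-in data, and the names X, A, R and x1 are illustrative only. The draw uses a Cholesky factor so that only base functions are needed; mvnrnd(mu1_2, Sigma1_2) from the Statistics Toolbox should serve the same purpose.

```matlab
% Stand-in data: 999 observations of 5 correlated attributes (synthetic,
% only so the sketch runs; replace X with the real 999-by-5 data set).
A = [1 0 0 0  0.5;
     0 1 0 0  0.3;
     0 0 1 0 -0.2;
     0 0 0 1  0.4;
     0 0 0 0  0.6];
X = randn(999, 5) * A;

mu    = mean(X);          % 1-by-5 row vector of sample means
Sigma = cov(X);           % 5-by-5 sample covariance matrix

i1 = 5;                   % index of the missing attribute(s)
i2 = 1:4;                 % indices of the known (conditioning) attributes
a  = [-1 2 1 5];          % known values of attributes i2 for the new point

% Partition the mean and covariance
mu1 = mu(i1);             mu2 = mu(i2);
Sigma11 = Sigma(i1, i1);  Sigma12 = Sigma(i1, i2);
Sigma21 = Sigma(i2, i1);  Sigma22 = Sigma(i2, i2);

% Conditional mean (row-vector form) and covariance, without calling INV
mu1_2    = mu1 + ((a - mu2) / Sigma22) * Sigma21;
Sigma1_2 = Sigma11 - Sigma12 * (Sigma22 \ Sigma21);

% Symmetrize to guard against round-off before factoring
Sigma1_2 = (Sigma1_2 + Sigma1_2') / 2;

% One draw from N(mu1_2, Sigma1_2): R'*R equals Sigma1_2
R  = chol(Sigma1_2);
x1 = mu1_2 + randn(1, numel(i1)) * R;
```

Because i1 and i2 are plain index vectors, the same code should handle several missing attributes at once (for example i1 = [2 5], i2 = [1 3 4]) as long as a lists the known values in the order given by i2.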
From: Tomaz on
Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <hpvk44$2if$1(a)fred.mathworks.com>...
> On 4/12/2010 12:41 PM, Tomaz wrote:
>
> > Peter thanks, but is this also useful when dealing with more than 2
> > independent variables? And I guess that there is no 'straightforward'
> > way of doing this in Matlab?
>
> Look closer at those formulas, and the definitions above them: the formula is entirely general, and it is simple to implement in MATLAB. I'm guessing someone has already posted something like this to the MATLAB Central File Exchange, but I haven't checked.
>
> If you have N variables, then "1" and "2" in the formula represent subsets of 1:N. That Wikipedia page happens to have things set up so that the conditioning variables are all at the end (i.e., "2" corresponds to (q+1):N) and the unobserved variables are all at the beginning (1:q), but that's just to make the notation simpler.
>
> Given a row vector mu and a cov matrix Sigma, define i2 as the coordinates that you are conditioning on, and i1 as everything else. Then let mu1 = mu(i1), Sigma11 = Sigma(i1,i1), etc., and apply those formulas. Two things:
>
> 1) You'll want to do something like
>
> Sigma1_2 = Sigma11 - Sigma12*(Sigma22\Sigma21)
>
> and similarly for mu, rather than explicitly using INV. Type "help slash".
>
> 2) You might have trouble because that Wikipedia page has the MVN in terms of column vectors. You'll want to use row vectors. And so, with a as the row vector of known values for the conditioning variables:
>
> mu1_2 = mu1 + ((a-mu2)/Sigma22)*Sigma21

Thank you, Peter! I appreciate your effort, and I believe I will be able to solve my problem now (with some work). Could you please just tell me which statistical term describes my problem best? Is it 'conditional sampling', 'conditional distributions' or something else? Any synonyms or alternatives? I am asking so that I can search for related material more efficiently...