Student copula fitting in high dimension [Matlab]

From: Lacoume Arnaud on 23 Feb 2010 11:50

Hello,

I have generated correlated pseudo-random data with the following procedure:

Uni=copularnd('t',A,4,100);

I have generated 100 random vectors with values in [0,1], correlated through a Student t copula whose parameters are a correlation matrix A (12-by-12, see the end of this post) and 4 degrees of freedom. After that, I fit a copula to these data:

[theta nu]=copulafit('t',Uni);

I recover nu = 4.0512 (that's OK), but theta differs strongly from my initial matrix A (see the end of this post).

Does anybody have an idea where this difference comes from? Maybe the optimisation of the log-likelihood in dimension 12 is very sensitive?
Thank you for your help.

Arnaud Lacoume

Correlation matrix A:
1 0.5 0.5 0.25 0.5 0.25 0.5 0.25 0.5 0.25 0.25 0.25
0.5 1 0.25 0.25 0.25 0.25 0.5 0.5 0.5 0.25 0.25 0.25
0.5 0.25 1 0.25 0.25 0.25 0.25 0.5 0.5 0.25 0.25 0.5
0.25 0.25 0.25 1 0.25 0.25 0.25 0.5 0.5 0.25 0.25 0.5
0.5 0.25 0.25 0.25 1 0.5 0.5 0.25 0.5 0.5 0.5 0.25
0.25 0.25 0.25 0.25 0.5 1 0.5 0.25 0.5 0.5 0.5 0.25
0.5 0.5 0.25 0.25 0.5 0.5 1 0.25 0.5 0.25 0.5 0.25
0.25 0.5 0.5 0.5 0.25 0.25 0.25 1 0.5 0.5 0.25 0.25
0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1 0.25 0.5 0.5
0.25 0.25 0.25 0.25 0.5 0.5 0.25 0.5 0.25 1 0.25 0.25
0.25 0.25 0.25 0.25 0.5 0.5 0.5 0.25 0.5 0.25 1 0.25
0.25 0.25 0.5 0.5 0.25 0.25 0.25 0.25 0.5 0.25 0.25 1

Fitted theta:
1.00 0.56 0.56 0.42 0.69 0.36 0.60 0.33 0.59 0.42 0.38 0.41
0.56 1.00 0.20 0.27 0.42 0.28 0.59 0.57 0.53 0.37 0.36 0.19
0.56 0.20 1.00 0.38 0.31 0.22 0.29 0.41 0.51 0.21 0.23 0.63
0.42 0.27 0.38 1.00 0.20 0.25 0.28 0.56 0.51 0.30 0.09 0.45
0.69 0.42 0.31 0.20 1.00 0.54 0.59 0.28 0.60 0.50 0.48 0.27
0.36 0.28 0.22 0.25 0.54 1.00 0.52 0.29 0.60 0.54 0.53 0.19
0.60 0.59 0.29 0.28 0.59 0.52 1.00 0.33 0.61 0.29 0.62 0.25
0.33 0.57 0.41 0.56 0.28 0.29 0.33 1.00 0.50 0.54 0.26 0.19
0.59 0.53 0.51 0.51 0.60 0.60 0.61 0.50 1.00 0.29 0.49 0.50
0.42 0.37 0.21 0.30 0.50 0.54 0.29 0.54 0.29 1.00 0.26 0.20
0.38 0.36 0.23 0.09 0.48 0.53 0.62 0.26 0.49 0.26 1.00 0.07
0.41 0.19 0.63 0.45 0.27 0.19 0.25 0.19 0.50 0.20 0.07 1.00

From: Peter Perkins on 24 Feb 2010 11:02

On 2/23/2010 11:50 AM, Lacoume Arnaud wrote:
> I have generated correlated pseudo-random data with the following procedure:
> Uni=copularnd('t',A,4,100);
>
> I have generated 100 random vectors with values in [0,1], correlated
> through a Student t copula whose parameters are a correlation matrix A
> (12-by-12, see the end of this post) and 4 degrees of freedom. After
> that, I fit a copula to these data:
> [theta nu]=copulafit('t',Uni);
>
> I recover nu = 4.0512 (that's OK), but theta differs strongly from
> my initial matrix A (see the end of this post).

Lacoume, I suspect it's simply that 100 observations is not enough to get a good estimate of the correlation matrix. You're estimating 66 correlation coefficients, after all. If you try this repeatedly with more data, you get estimates that are closer to the known true value (even when using the approximate maximum likelihood method, which is much faster, by the way).
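
To see the effect of sample size directly, something like the following sketch could be run. It is only an illustration under stated assumptions: it uses a 12-by-12 equicorrelation matrix with rho = 0.4 as a stand-in for the matrix A in the original post, the sample sizes and the variable names (ns, maxErr, thetaHat, nuHat) are arbitrary choices, and it relies on the 'ApproximateML' method of copulafit for speed.

```matlab
% Sketch: how the error in the fitted t-copula correlation matrix
% shrinks as the number of observations grows.
rng(0);                               % fixed seed so the run is repeatable
d   = 12;                             % dimension, as in the original post
nu  = 4;                              % true degrees of freedom
rho = 0.4;                            % ASSUMPTION: stand-in correlation level
A   = rho*ones(d) + (1 - rho)*eye(d); % ASSUMPTION: equicorrelation matrix, not the poster's A

ns     = [100 1000 10000];            % sample sizes to compare (arbitrary)
maxErr = zeros(size(ns));             % largest absolute error over all entries

for k = 1:numel(ns)
    U = copularnd('t', A, nu, ns(k));                       % simulate from the t copula
    [thetaHat, nuHat] = copulafit('t', U, ...
                                  'Method', 'ApproximateML'); % fast approximate fit
    maxErr(k) = max(abs(thetaHat(:) - A(:)));
    fprintf('n = %6d   nu = %5.2f   max |theta - A| = %.3f\n', ...
            ns(k), nuHat, maxErr(k));
end
```

The printed error for n = 100 should be of the same order as the discrepancies in the original post, and it should drop noticeably for the larger samples.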

As a simple analogy, try estimating the variance of a univariate normal distribution with 2 observations, then 5, then 10, then 100. This is a statistical problem, not a software problem.
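
The analogy can be sketched in a few lines. This is a minimal illustration, assuming a standard normal distribution (true variance 1) and an arbitrary number of repetitions nRep; the variable names are hypothetical.

```matlab
% Sketch: spread of the sample variance of N(0,1) data for several sample sizes.
rng(0);                      % fixed seed so the run is repeatable
ns     = [2 5 10 100];       % sample sizes from the analogy
nRep   = 2000;               % ASSUMPTION: number of repeated experiments
spread = zeros(size(ns));    % std. dev. of the variance estimates

for k = 1:numel(ns)
    X  = randn(ns(k), nRep); % each column is one sample of size ns(k)
    s2 = var(X);             % column-wise sample variances, 1-by-nRep
    spread(k) = std(s2);     % how much the estimates scatter around 1
    fprintf('n = %3d   std of variance estimate = %.3f\n', ns(k), spread(k));
end
```

For normal data the standard deviation of the sample variance is sigma^2*sqrt(2/(n-1)), so the scatter shrinks only slowly with n. The same happens with each of the 66 correlation coefficients in the copula fit.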

Hope this helps.

From: Lacoume Arnaud on 25 Feb 2010 11:19

Yes, you are right. I have tested it!
I had not thought of that, although it is the first thing to check, simply because I had it in my head that 100 data points would be plenty. This means that the fit should be done with expert judgment.
Thank you
