From: Peter Perkins on
On 3/9/2010 12:09 AM, Yudha wrote:
> Hi,
>
> In MATLAB, when we use corrcoef to compute a correlation, we also get a p-value and confidence bounds.
>
> The help document says :
>
> "The p-value is computed by transforming the correlation to create a t statistic having n-2 degrees of freedom, where n is the number of rows of X. The confidence bounds are based on an asymptotic normal distribution of 0.5*log((1+R)/(1-R)), with an approximate variance equal to 1/(n-3). These bounds are accurate for large samples when X has a multivariate normal distribution. The 'pairwise' option can produce an R matrix that is not positive definite."
>
> I have some queries here :
>
> 1. Does creating a "t-statistic" for the p-value mean assuming that the sampling distribution of the correlation is a t-distribution? Is this the same as the asymptotic normal distribution that is used for the confidence bounds? If it is not the same, is this a correct approach? From what I have read in some references, we should assume the same distribution for the significance test and the confidence bounds.

Yudha, these are standard approximations, and can be found in many statistics texts. Wikipedia also describes them:

<http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#Approaches_based_on_mathematical_approximations>

There should be references in the doc for CORRCOEF; I have made a note to add them.
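
For illustration, here is a rough sketch of the two approximations as the help text describes them, written with base MATLAB functions only. It is not taken from the CORRCOEF source; the data and variable names are made up, and the results should agree closely with what CORRCOEF returns at the default alpha of 0.05.

```matlab
% Sketch: reproduce the p-value and confidence bounds described in the
% CORRCOEF help text. Example data and variable names are hypothetical.
rng(0);                                  % fixed seed so the run is repeatable
n = 25;
x = randn(n,1);
y = 0.5*x + randn(n,1);

[R,P,RLO,RUP] = corrcoef(x,y);           % default alpha = 0.05
r = R(1,2);

% p-value: transform r to a t statistic with n-2 degrees of freedom
df = n - 2;
t  = r*sqrt(df/(1 - r^2));
pT = betainc(df/(df + t^2), df/2, 0.5);  % two-sided tail area of Student's t

% Confidence bounds: 0.5*log((1+r)/(1-r)) is treated as normal with
% approximate variance 1/(n-3), then mapped back with tanh
z     = 0.5*log((1 + r)/(1 - r));
zcrit = sqrt(2)*erfcinv(0.05);           % upper 2.5% point of the standard normal
ci    = tanh(z + [-1 1]*zcrit/sqrt(n - 3));

fprintf('p-value: corrcoef %.6g, by hand %.6g\n', P(1,2), pT);
fprintf('bounds : corrcoef [%.4f %.4f], by hand [%.4f %.4f]\n', ...
        RLO(1,2), RUP(1,2), ci(1), ci(2));
```

The test and the bounds use different approximations, but both are standard, and they are computed separately so either one can be swapped out.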


> 2. How can I perform a different significance test and compute confidence bounds for the correlation based on a different sampling distribution assumption (Z' or otherwise) in MATLAB? It seems corrcoef doesn't have this facility. For large samples (N>30), the literature says to assume a Z-distribution instead of t.

The t-distribution is asymptotically equivalent to the normal for large degrees of freedom. I suspect the recommendations you have seen are suggesting normal not because that is a better approximation, but because it is easier to compute from a table.
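
If you want to see this for yourself, the following sketch compares two-sided p-values from the t and the normal distribution for the same statistic as the degrees of freedom grow, and then shows one way to get a normal-based test from the Fisher transform. The statistic value, r, and n are arbitrary example numbers, and the Fisher-based test is an illustration under those assumptions, not something CORRCOEF provides.

```matlab
% Sketch: t-based versus normal-based two-sided p-values for one statistic.
stat = 2.0;                                  % arbitrary example value
df   = [5 30 100 1000];
pT   = betainc(df./(df + stat^2), df/2, 0.5);   % Student's t, one value per df
pZ   = erfc(stat/sqrt(2));                      % standard normal

for k = 1:numel(df)
    fprintf('df = %4d: t p-value %.4f, normal p-value %.4f\n', df(k), pT(k), pZ);
end

% Normal-based test of zero correlation via the Fisher transform,
% for a hypothetical sample correlation r from n observations.
r = 0.35;
n = 50;
zstat   = 0.5*log((1 + r)/(1 - r))*sqrt(n - 3);
pFisher = erfc(abs(zstat)/sqrt(2));
fprintf('Fisher z test: statistic %.4f, p-value %.4f\n', zstat, pFisher);
```

The t-based p-value is always the larger of the two, and the gap shrinks steadily as the degrees of freedom increase.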