From: Ching on 17 Jan 2010 03:13 Hi everyone. I'm testing the possible multicollinearity among the fixed effects in proc mixed model with the option corrb in the model statement but wondering whether I should specify method=reml or method=ml? Your help is greatly appreciated. Have a good week.
From: Dale McLerran on 22 Jan 2010 14:23 --- On Thu, 1/21/10, Ryan <ryan.andrew.black(a)GMAIL.COM> wrote: > From: Ryan <ryan.andrew.black(a)GMAIL.COM> > Subject: Re: proc mixed: multicolinearity > To: SAS-L(a)LISTSERV.UGA.EDU > Date: Thursday, January 21, 2010, 6:17 PM > On Jan 19, 2:32 pm, stringplaye...(a)YAHOO.COM > (Dale McLerran) wrote: > > --- On Sun, 1/17/10, Peter Flom <peterflomconsult...(a)MINDSPRING.COM> > wrote: > > > > > > I suppose that one could construct an example where collinearity > > statistics should be examined at the subject level. In general, > > I would not worry about assessing collinearity within subjects. > > > > Mixed model parameter estimates for the fixed effects are obtained > > by minimizing > > > > (Y - X*beta)' * V^(-1) * (Y - X*beta) > > > > where X is the covariate matrix for all subjects. Thus, we > > are primarily concerned with collinearity that is manifest > > across all subjects. The variance term in the middle is > > obtained as > > > > V = Z*G*Z' + R > > > > where Z is the within-subjects design matrix. Now, the > > efficiency of our estimate of beta depends on having a good > > representation of V. To the extent that collinearity does > > not affect the estimate of a predicted value but whether > > we are able to correctly state the effect of each variable on > > that estimate, we don't really care whether V is attributable > > to column Z1, Z2, ..., Zk. As long as we get the estimate of > > the between subject variance correct, then collinearity becomes > > an issue only for the complete design matrix X. > > > > I would note that the solution for beta hat in mixed models > > is actually no different than the solution for beta hat in a > > logistic regression model or a Poisson regression model or > > any of the other generalized linear models solutions. In all > > of these, we obtain beta hat as > > > > beta hat = (X'*W*X)^(-1) * X'*W*Y > > > > where W=V^(-1) in the mixed models analysis. > > > > So, to the extent that one can use PROC REG to assess collinearity > > for a generalized linear model, then one could use PROC REG to > > assess collinearity in a mixed model setting. However, the > > above solution can be rewritten as > > > > beta hat = (X'*W^(1/2) * W^(1/2)*X)^(-1) > > * X'*W^(1/2) * W^(1/2)*Y > > > > = (U'*U)^(-1) * U'*Z > > > > where W^(1/2) is symmetric so that X'*W^(1/2)=(W^(1/2)*X)'=U'. > > If one wanted to really be diligent, then one would check > > collinearity employing the design matrix U. But the general > > advice to check collinearity employing the design matrix X > > (which includes the intercept term) should provide a reasonable > > evaluuation of collinearity. > > > > Dale > > > > --------------------------------------- > > Dale McLerran > > Fred Hutchinson Cancer Research Center > > mailto: dmclerra(a)NO_SPAMfhcrc.org > > Ph: (206) 667-2926 > > Fax: (206) 667-5977 > > ---------------------------------------- Hide quoted > text - > > > > - Show quoted text - > > Dale, > > I'm not sure if you'll see this post, but if you do and have the time, > I'd be very interested in your thoughts on to two loosely related > questions to your post. > > (1) I recently computed beta hats for a random intercept model using > the formula you presented... > > beta hat = ((X'*(V)^(-1)*X)^(-1)) * X'*(V)^(-1)*Y > > However since X was overparameterized, I took the generalized inverse > instead of taking the true inverse. Although I obtained correct > results, I would like to confirm that using the generalized inverse > function is valid. > > (2) I noticed the beta coefficients were exactly the same as those > obtained from the fixed effects model. This led me to wonder if this > will always turn out to be the case in the special circumstance where > the only random term is a random intercept and the design is fully > balanced with no missing data. > > Thanks, > > Ryan > Ryan, With respect to your first question, yes, the generalized inverse is appropriate when X'X is not full rank. I haven't investigated the properties of the fully balanced design, but I believe that it is probably true that in a fully balanced design, the point estimates for the random effect model would be the same as the point estimates for the fixed effect model. The code below certainly bears this out empirically. In this code, I generate data for 10 experiments. Each experiment has 20 clusters, the first 10 of which are assigned to arm=0 and the second 10 of which are assigned to arm=1. Thus, the data are fully balanced with respect to treatment arm. I also include a covariate X. Subsequently, the MIXED procedure is invoked twice, once specifying only a fixed effect model with intercept and arm. The second invocation of the MIXED procedure adds in the random intercept effect with clusters nested within arm. Note that for both invocations of the MIXED procedure, the covariate is excluded from the model. Parameter estimates for each run are saved to an output data set and then the COMPARE procedure is invoked to examine whether the point estimates are the same for the fixed and random effect models. We observe that the point estimates are (for all practical purposes, and likely theoretically) identical. It is worth observing that the point estimates are identical even though we estimated the wrong model since we did not include the covariate X. When X is added to the model, the design is no longer fully balanced. This is seen in a second round of invocations of the MIXED procedure followed by a PROC COMPARE. /* Data generation */ data test; v_clust=4; v_resid=1; do rep=1 to 10; do cluster=1 to 20; gamma = sqrt(v_clust)*rannor(1234579); do i=1 to 20; arm = (i>10); x = rannor(1234579); y = 3 + 2*arm + 4*x + gamma + rannor(1234579); output; end; end; end; keep rep cluster arm x y; run; title "Balanced design with fixed effects only"; ods output SolutionF=ParmsFixed; proc mixed data=test; by rep; class arm; model y = arm / s; run; title "Balanced design with random intercept effects"; ods output SolutionF=ParmsRandom; proc mixed data=test; by rep; class cluster arm; model y = arm / s; random intercept / subject=cluster(arm); run; title "Compare parameters for the balanced design models"; proc compare base=ParmsFixed(keep=estimate) compare=ParmsRandom(keep=estimate); run; title "Covariate added to fixed effect model"; ods output SolutionF=ParmsFixed; proc mixed data=test; by rep; class arm; model y = arm x / s; run; title "Covariate added to random effect model"; ods output SolutionF=ParmsRandom; proc mixed data=test; by rep; class cluster arm; model y = arm x / s; random intercept / subject=cluster(arm); run; title "Compare parameters estimated with covariate in model"; proc compare base=ParmsFixed(keep=estimate) compare=ParmsRandom(keep=estimate); run; This is only empirical confirmation of what you have observed. However, it is not unreasonable to believe that for a fully balanced design, the random effects don't influence the parameters which enter as fixed effects. And given that we observe this empirically for 10 trials, it would seem a pretty safe bet that we should observe this in the future. Dale --------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra(a)NO_SPAMfhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 ---------------------------------------
From: Ryan on 22 Jan 2010 17:59 On Jan 22, 2:23 pm, stringplaye...(a)YAHOO.COM (Dale McLerran) wrote: > --- On Thu, 1/21/10, Ryan <ryan.andrew.bl...(a)GMAIL.COM> wrote: > > > > > > > From: Ryan <ryan.andrew.bl...(a)GMAIL.COM> > > Subject: Re: proc mixed: multicolinearity > > To: SA...(a)LISTSERV.UGA.EDU > > Date: Thursday, January 21, 2010, 6:17 PM > > On Jan 19, 2:32 pm, stringplaye...(a)YAHOO.COM > > (Dale McLerran) wrote: > > > --- On Sun, 1/17/10, Peter Flom <peterflomconsult...(a)MINDSPRING.COM> > > wrote: > > > > I suppose that one could construct an example where collinearity > > > statistics should be examined at the subject level. In general, > > > I would not worry about assessing collinearity within subjects. > > > > Mixed model parameter estimates for the fixed effects are obtained > > > by minimizing > > > > (Y - X*beta)' * V^(-1) * (Y - X*beta) > > > > where X is the covariate matrix for all subjects. Thus, we > > > are primarily concerned with collinearity that is manifest > > > across all subjects. The variance term in the middle is > > > obtained as > > > > V = Z*G*Z' + R > > > > where Z is the within-subjects design matrix. Now, the > > > efficiency of our estimate of beta depends on having a good > > > representation of V. To the extent that collinearity does > > > not affect the estimate of a predicted value but whether > > > we are able to correctly state the effect of each variable on > > > that estimate, we don't really care whether V is attributable > > > to column Z1, Z2, ..., Zk. As long as we get the estimate of > > > the between subject variance correct, then collinearity becomes > > > an issue only for the complete design matrix X. > > > > I would note that the solution for beta hat in mixed models > > > is actually no different than the solution for beta hat in a > > > logistic regression model or a Poisson regression model or > > > any of the other generalized linear models solutions. In all > > > of these, we obtain beta hat as > > > > beta hat = (X'*W*X)^(-1) * X'*W*Y > > > > where W=V^(-1) in the mixed models analysis. > > > > So, to the extent that one can use PROC REG to assess collinearity > > > for a generalized linear model, then one could use PROC REG to > > > assess collinearity in a mixed model setting. However, the > > > above solution can be rewritten as > > > > beta hat = (X'*W^(1/2) * W^(1/2)*X)^(-1) > > > * X'*W^(1/2) * W^(1/2)*Y > > > > = (U'*U)^(-1) * U'*Z > > > > where W^(1/2) is symmetric so that X'*W^(1/2)=(W^(1/2)*X)'=U'. > > > If one wanted to really be diligent, then one would check > > > collinearity employing the design matrix U. But the general > > > advice to check collinearity employing the design matrix X > > > (which includes the intercept term) should provide a reasonable > > > evaluuation of collinearity. > > > > Dale > > > > --------------------------------------- > > > Dale McLerran > > > Fred Hutchinson Cancer Research Center > > > mailto: dmclerra(a)NO_SPAMfhcrc.org > > > Ph: (206) 667-2926 > > > Fax: (206) 667-5977 > > > ---------------------------------------- Hide quoted > > text - > > > > - Show quoted text - > > > Dale, > > > I'm not sure if you'll see this post, but if you do and have the time, > > I'd be very interested in your thoughts on to two loosely related > > questions to your post. > > > (1) I recently computed beta hats for a random intercept model using > > the formula you presented... > > > beta hat = ((X'*(V)^(-1)*X)^(-1)) * X'*(V)^(-1)*Y > > > However since X was overparameterized, I took the generalized inverse > > instead of taking the true inverse. Although I obtained correct > > results, I would like to confirm that using the generalized inverse > > function is valid. > > > (2) I noticed the beta coefficients were exactly the same as those > > obtained from the fixed effects model. This led me to wonder if this > > will always turn out to be the case in the special circumstance where > > the only random term is a random intercept and the design is fully > > balanced with no missing data. > > > Thanks, > > > Ryan > > Ryan, > > With respect to your first question, yes, the generalized > inverse is appropriate when X'X is not full rank. I haven't > investigated the properties of the fully balanced design, > but I believe that it is probably true that in a fully > balanced design, the point estimates for the random effect > model would be the same as the point estimates for the fixed > effect model. The code below certainly bears this out > empirically. > > In this code, I generate data for 10 experiments. Each experiment > has 20 clusters, the first 10 of which are assigned to arm=0 and > the second 10 of which are assigned to arm=1. Thus, the data are > fully balanced with respect to treatment arm. I also include a > covariate X. > > Subsequently, the MIXED procedure is invoked twice, once > specifying only a fixed effect model with intercept and arm. > The second invocation of the MIXED procedure adds in the > random intercept effect with clusters nested within arm. Note > that for both invocations of the MIXED procedure, the covariate > is excluded from the model. Parameter estimates for each run > are saved to an output data set and then the COMPARE procedure > is invoked to examine whether the point estimates are the same > for the fixed and random effect models. We observe that the > point estimates are (for all practical purposes, and likely > theoretically) identical. > > It is worth observing that the point estimates are identical > even though we estimated the wrong model since we did not > include the covariate X. When X is added to the model, the > design is no longer fully balanced. This is seen in a second > round of invocations of the MIXED procedure followed by a > PROC COMPARE. > > /* Data generation */ > data test; > v_clust=4; > v_resid=1; > do rep=1 to 10; > do cluster=1 to 20; > gamma = sqrt(v_clust)*rannor(1234579); > do i=1 to 20; > arm = (i>10); > x = rannor(1234579); > y = 3 + 2*arm + 4*x + gamma + rannor(1234579); > output; > end; > end; > end; > keep rep cluster arm x y; > run; > > title "Balanced design with fixed effects only"; > ods output SolutionF=ParmsFixed; > proc mixed data=test; > by rep; > class arm; > model y = arm / s; > run; > > title "Balanced design with random intercept effects"; > ods output SolutionF=ParmsRandom; > proc mixed data=test; > by rep; > class cluster arm; > model y = arm / s; > random intercept / subject=cluster(arm); > run; > > title "Compare parameters for the balanced design models"; > proc compare base=ParmsFixed(keep=estimate) > compare=ParmsRandom(keep=estimate); > run; > > title "Covariate added to fixed effect model"; > ods output SolutionF=ParmsFixed; > proc mixed data=test; > by rep; > class arm; > model y = arm x / s; > run; > > title "Covariate added to random effect model"; > ods output SolutionF=ParmsRandom; > proc mixed data=test; > by rep; > class cluster arm; > model y = arm x / s; > random intercept / subject=cluster(arm); > run; > > title "Compare parameters estimated with covariate in model"; > proc compare base=ParmsFixed(keep=estimate) > compare=ParmsRandom(keep=estimate); > run; > > This is only empirical confirmation of what you have observed. > However, it is not unreasonable to believe that for a fully > balanced design, the random effects don't influence the > parameters which enter as fixed effects. And given that we > observe this empirically for 10 trials, it would seem a pretty > safe bet that we should observe this in the future. > > Dale > > --------------------------------------- > Dale McLerran > Fred Hutchinson Cancer Research Center > mailto: dmclerra(a)NO_SPAMfhcrc.org > Ph: (206) 667-2926 > Fax: (206) 667-5977 > ---------------------------------------- Hide quoted text - > > - Show quoted text - Dale, Thank you for responding to my initial question and empirically testing the second question. I ran the code you presented and needless to say I continue to observe what I'd expect. Hope you have an enjoyable weekend. Ryan
|
Pages: 1 Prev: DO While in Macro Next: Which process lock a table on Win environment |