From: Ching on
Hi all.

Please don't laugh at me if my questions seem easy to you. I'm a
novice in multilevel modeling and have been doing heaps of reading about
it, but unfortunately the more I read, the less I understand. I'm
currently trying to fit a 3-level multilevel model in PROC MIXED with
DDFM=SATTERTH and METHOD=REML, and I was asked to find out what the
determinants are for my log-transformed exposure response variable.
However, I'm wondering if I can do 2 levels instead of 3, somehow
based on the output from PROC MIXED such as the COVTEST results? I've been
told the COVTEST output is not reliable. I've been testing each
predictor in the mixed model. Should I keep a predictor based on how
much variability it explains according to the COVTEST output, based
simply on its p-value, or based on the change in its beta coefficient
compared to the model with no fixed effects? Which is better when I
compare 2 models: their AIC, or the difference in -2 Res Log Likelihood?
When I compare the models, do I have to change to "PROC MIXED
METHOD=ML"? I read somewhere that when comparing models based
on the difference in -2 Res Log Likelihood, I'd have to use ML instead of
REML. Do I have to check for collinearity as in other
regression modeling?

My code is like below:
proc mixed data=tmp covtest method=reml cl;
class id A;
model log_exp = / DDFM=SATTERTH solution;
random int / subject=A;
random int / subject=A*id;
run;

What is the model below modeling when var1 is included in the
1st and/or 2nd RANDOM statement? I know that when var1 is included in the
1st RANDOM statement, you are allowing the var1 slope to vary
between levels of A. But what does that mean in plain English? Under what
conditions should we model this way, and does var1 have to be
fitted as a fixed effect as well?
proc mixed data=tmp covtest method=reml cl;
class id A;
model log_exp = var1 / DDFM=SATTERTH solution;
random int / subject=A;
random int / subject=A*id;
run;

Thanks so much for taking the time to read my problem. Have a great
weekend.
From: Dale McLerran on
Ching,

There is nothing to laugh at in any of your questions.
They are all good questions. I, too, had to go through a
long period where the more that I read, the more confused
I became. That is what learning often entails.

If I understand your question correctly, you want to
determine whether var1 should be included as a random
effect in either of your RANDOM statements. By the
way, including var1 as a random effect would mean that
the slope for var1 differed across subjects (levels of
variable A or level of interaction of A with ID). I
am not sure how to state that more succinctly. However,
you might look at the following source for an example
where random slopes are shown graphically:

http://www.cmm.bristol.ac.uk/MLwiN/tech-support/workshops/materials/Randomslope.pdf

You undoubtedly want to keep var1 in your MODEL
statement (unless you think that the average slope
across all subjects would be 0 - probably not a
reasonable assumption to make).
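To make that concrete, a random slope for var1 at the A level would be
specified roughly as below. This is only a sketch built on Ching's posted
code: var1 stays on the MODEL statement as the average slope, and the
TYPE=UN option (an addition here, not in the original code) lets the
random intercept and random slope covary.

```sas
proc mixed data=tmp covtest method=reml cl;
  class id A;
  model log_exp = var1 / ddfm=satterth solution;
  /* random intercept AND random slope for var1 at the A level;
     type=un fits an unstructured 2x2 covariance so the intercept
     and slope deviations can be correlated */
  random int var1 / subject=A type=un;
  /* random intercept at the id-within-A level, as before */
  random int / subject=A*id;
run;
```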

I would suggest using a likelihood ratio test to examine
the effect of including var1 as random. When testing
hypotheses about the random effects, it is not
necessary to specify METHOD=ML. METHOD=REML is fine
for testing hypotheses about the random effects. If
you wanted to test an assumption about var1 as a
fixed effect (on the MODEL statement), then you would
need to specify METHOD=ML for both restricted and
full models. But your interest is in whether you
need var1 on the RANDOM statement. For such tests,
METHOD=REML is just fine.

When you can use a likelihood ratio test, I prefer
that to using some information criterion like AIC for
comparing models.
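As a worked illustration of the arithmetic (the -2 Res Log Likelihood
values below are made up, not from Ching's data): the likelihood ratio
statistic is just the difference in -2 Res Log Likelihood between the
reduced and full fits, referred to a chi-square distribution with degrees
of freedom equal to the number of covariance parameters added.

```python
import math

# Hypothetical -2 Res Log Likelihood values from two nested REML fits
# that share the same fixed effects (so the REML likelihood ratio test
# of the random effects is valid).
neg2ll_reduced = 412.7  # random intercepts only
neg2ll_full = 404.3     # adds a random slope for var1 (one variance, one covariance)

lr_stat = neg2ll_reduced - neg2ll_full  # difference in -2 Res Log Likelihood
df = 2                                  # covariance parameters added

# For df = 2 the chi-square survival function has the closed form exp(-x/2).
p_value = math.exp(-lr_stat / 2)

# Strictly speaking, because a variance component is tested on the boundary
# of its parameter space, this naive chi-square p-value is conservative;
# the exact reference distribution is a mixture of chi-squares.
print(f"LR statistic = {lr_stat:.1f}, p = {p_value:.4f}")
# prints: LR statistic = 8.4, p = 0.0150
```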

I think that just about covers all your questions.
Hope these responses have helped.

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra(a)NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------


--- On Thu, 11/19/09, Ching <kcwong5(a)GMAIL.COM> wrote:

> From: Ching <kcwong5(a)GMAIL.COM>
> Subject: Multilevel modeling in proc mixed
> To: SAS-L(a)LISTSERV.UGA.EDU
> Date: Thursday, November 19, 2009, 10:35 PM
From: sudip chatterjee on
Ching,

a) If you try to compare 2 models, then you have to set the METHOD=ML
option in PROC MIXED.

What I will say is that you can visit the UCLA multilevel modeling site
http://www.ats.ucla.edu/stat (then go to the multilevel
modeling section from there), or else you can start from here:
http://www.ats.ucla.edu/stat/sas/topics/MLM.htm

b) About keeping the predictors: I think that depends very much
on your theoretical model rather than just on your COVTEST results or
p-values.

c) As you have not described your study and your goal, it is hard to
interpret your model code; but if you browse previous SAS-L posts
you will find some that show easier ways to code a
3-level model.

d) If you think your predictor varies across your study
areas/subjects, then keep the predictor in the RANDOM statement.
(You can consult the PROC MIXED documentation in SAS; it has lots of
examples and help.)

I hope this helps you (in some way or other).

All the best !

On 11/20/09, Ching <kcwong5(a)gmail.com> wrote:
From: Dale McLerran on
--- On Tue, 11/24/09, sudip chatterjee <sudip.memphis(a)GMAIL.COM> wrote:

> From: sudip chatterjee <sudip.memphis(a)GMAIL.COM>
> Subject: Re: Multilevel modeling in proc mixed
> To: SAS-L(a)LISTSERV.UGA.EDU
> Date: Tuesday, November 24, 2009, 6:38 AM
> Ching,
>
> a) If you try to compare 2 models then you have to set the
> method =
> ML option in proc mixed

This is not necessary if you are testing assumptions about
random effects. In order to test assumptions about fixed
effects, it is correct that you must specify METHOD=ML.

>
> b) About keeping the predictors, I think that is very much dependent
> on your theoretical model rather than just looking at your covtest or
> p values

I would agree. However, it is common practice to examine
assumptions about random slope effects and remove the random
slope effects from the model if the data do not support
the assumption of random slopes.

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra(a)NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
From: Ching on
Hi Dale and Sudip.

Thanks so much for your advice. I really appreciated it.

I'll go to check out the websites for sure.

Have a wonderful day!!