Latent Class Analysis - Question [SAS]

From: Oliver Kuss on 1 Dec 2009 11:02

On 1 Dec., 16:23, Ryan <ryan.andrew.bl...(a)gmail.com> wrote:
> On Dec 1, 9:15 am, Oliver Kuss <Oliver.K...(a)medizin.uni-halle.de>
> wrote:
>
> > On 1 Dec., 11:43, Ryan <ryan.andrew.bl...(a)gmail.com> wrote:
>
> > > On Dec 1, 2:48 am, Oliver Kuss <Oliver.K...(a)medizin.uni-halle.de>
> > > wrote:
>
> > > > > On 1 Dec., 02:16, Ryan <ryan.andrew.bl...(a)gmail.com> wrote:
>
> > > > > Hi,
>
> > > > > Let me apologize in advance for asking the same question twice. I
> > > > > figured I'd give it another shot.
>
> > > > > Has anyone seen/developed code to run a random effects latent class
> > > > > > analysis in SAS? Let's say we have three dichotomous indicator
> > > > > variables (0=No, 1=Yes) that we hypothesize load on a latent class
> > > > > variable (with 3 classes).
>
> > > > > A simple example I just made up: We suspect that there are three
> > > > > classes of people who use illicit substances (class 1 = non-users/
> > > > > abstainers, class 2 = casual users, class 3 = addicts). Assume we
> > > > > cannot measure directly if someone belongs to any of these classes,
> > > > > but we have 3 indicator variables as indicated previously. Let's also
> > > > > assume that we have two cases per person (measured at equal
> > > > > intervals)...
>
> > > > > /----------------------------------------------/
> > > > > Person Time X1 X2 X3
> > > > > 1 1 0 1 1
> > > > > 1 2 1 0 1
> > > > > 2 1 0 0 0
> > > > > 2 2 0 0 1
> > > > > .
> > > > > .
>
> > > > > N
> > > > > /----------------------------------------------/
>
> > > > > Does anyone know how to construct code (presumably in nlmixed) to run
> > > > > a random intercept LCA and compute the following:
>
> > > > > (1) Probability that a positive response on each item is associated
> > > > > with a particular class
> > > > > (2) Probability that each case is associated with a particular class
> > > > > (3) Any indication that the number of classes we selected does not
> > > > > yield the best fitting model. I assume re-running the model assuming 2
> > > > > classes, 4 classes, etc. and comparing AICs/BICs might work.
>
> > > > > Any thoughts/recommendations/references would be great.
>
> > > > > Thanks,
>
> > > > > Ryan
>
> > > > Dear Ryan,
> > > > it seems that you also have a longitudinal structure in your data set
> > > > with two (or even more) observations for each person. Then you should
> > > > definitely look at PROC TRAJ (http://www.andrew.cmu.edu/user/bjones/),
> > > > > a user-written SAS procedure that fits discrete mixture models to
> > > > longitudinal data. I once worked with it and it did fine. Before final
> > > > publication of the results I also coded the model with PROC NLP and it
> > > > yielded the same results. So you might also use PROC NLP or PROC
> > > > NLMIXED for latent class models.
>
> > > > Hope that helps,
> > > > > Oliver
>
> > > Thanks for responding, Oliver. Thank you for the info about TRAJ
> > > procedure. I would prefer to run the model using the NLMIXED
> > > procedure. I assume it is possible to run such a model as evidenced by
> > > a post by Dale a while back:
>
> > > > http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0503a&L=sas-l&D=0&P=26375
>
> > > What's confusing to me about Dale's post is the dependent variable.
> > > What exactly would be the dependent variable in an LCA such as the
> > > example I made up?
>
> > > > Ryan
>
> > Dear Ryan,
> > I got your point. I admittedly do not know how such a model can be
> > coded with PROC NLMIXED but I have two more hints which might be
> > useful:
> > > 1. There is a SUGI paper using PROC CATMOD for LCA
> > >    (http://www2.sas.com/proceedings/sugi31/201-31.pdf).
> > > 2. There is a user-written SAS procedure LCA
> > >    (http://methodology.psu.edu/index.php/downloads/proclcalta)
> > >    whose first example has four binary indicators which
> > >    should be grouped in two classes (similar to your problem, without a
> > >    "response"). Maybe you can use PROC LCA to get the results for
> > >    your data set and then use the description of the model in the PROC
> > >    LCA handbook to translate the model into PROC NLMIXED.
>
> > Yours,
> > > Oliver
>
> Who knew there were so many SAS procedures that could handle an LCA
> model?! I will certainly try to run it (without the REs) using one of
> these other procedures. Regardless, however, if I want to run a random
> effects LCA in SAS, I'm probably going to have to figure out how to do
> it in nlmixed. Thanks again for the info. -Ryan

Dear Ryan,
I'm in a mood today for searching for LCA software ;-)))
Give the LCR macro of Diana L. Miglioretti a try
(http://www.grouphealthresearch.org/perpages/migliore/downlds/LCREG2.SAS).
It makes the IML code for fitting the models explicit and may be an
even better starting point for coding the model in NLMIXED.

Good luck,
Oliver

From: Ryan on 1 Dec 2009 11:32

Hi Oliver,

Thank you for providing even more possibilities. Yes, this might be a
good starting point. I also found an article that discusses a somewhat
more complex LCA model (including latent predictors) using NLMIXED
that might help me as well. In case anybody's interested, it's located
here:

http://biostatistics.oxfordjournals.org/cgi/content/full/7/1/145

I have not reviewed this article very closely, but at first glance I
notice that the NLMIXED code in the article uses a dependent variable
called "Dummy", which they state is a placeholder.

If I figure this out, I will definitely post back the solution.

Thanks again!

Ryan

From: oloolo on 1 Dec 2009 14:32

When you specify general(loglikelihood) in the NLMIXED MODEL statement,
the left-hand-side variable is irrelevant, because the true dependent
variable is incorporated into the (log)likelihood function you construct
using programming statements.
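
To make this concrete, here is a minimal sketch (not taken from any post in this thread) of a one-parameter Bernoulli model for X1 fitted through general(). The data set names mydata and mydata2 and the variable dummy are assumptions for illustration only. As far as I know, NLMIXED still expects the left-hand variable to exist in the input data set and to be nonmissing, even though its values never enter the likelihood, which is why a constant is created first.

```sas
/* Assumed input: data set mydata with a 0/1 variable x1.      */
/* dummy is a hypothetical placeholder response, constant 1.   */
data mydata2;
   set mydata;
   dummy = 1;
run;

proc nlmixed data=mydata2;
   parms b0 = 0;                            /* logit of P(x1 = 1)       */
   p  = 1/(1 + exp(-b0));                   /* back to probability      */
   ll = x1*log(p) + (1 - x1)*log(1 - p);    /* Bernoulli log likelihood */
   model dummy ~ general(ll);               /* dummy never appears in ll */
run;
```

The fitted p should agree with the sample mean of x1, which is an easy way to check that the placeholder really plays no role.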

From: Dale McLerran on 2 Dec 2009 17:47

Ryan,

I looked at the code from the Guo paper to see how they
handled the latent class part of their model. It is really
pretty straightforward. For a latent class analysis without
any predictor variables and with just three binary response
variables X1, X2, and X3, syntax would be as follows:

proc nlmixed data=mydata tech=quanew lis=2 method=gauss
             maxiter=1000 gconv=.00000000001 fconv=.00000000001;
  parms
    /* Parameter which expresses probability of */
    /* latent class 1 vs latent class 2 */
    alpha1 = 0.5
    /* If there are more latent classes, then need */
    /* alpha1, alpha2, etc. Number of alpha parms */
    /* is one less than number of latent classes. */

    /* Parameters which express probability of Xi=1 | latent class 1 */
    bpi11 = 1 bpi21 = 1 bpi31 = 0

    /* Parameters which express probability of Xi=1 | latent class 2 */
    bpi12 = 0.5 bpi22 = 0.5 bpi32 = -0.5

    /* Need bpi13, bpi23, etc. if three latent classes */
  ;

  /* Bound every logit-scale item parameter. The list names only the */
  /* six parameters of this three-item, two-class model.             */
  bounds -6 <= bpi11 bpi21 bpi31 bpi12 bpi22 bpi32 <= 6;

  /* latent class part */
  pi11 = 1/(1+exp(-bpi11)); pi12 = 1/(1+exp(-bpi12));
  pi21 = 1/(1+exp(-bpi21)); pi22 = 1/(1+exp(-bpi22));
  pi31 = 1/(1+exp(-bpi31)); pi32 = 1/(1+exp(-bpi32));
  prod11 = (pi11**x1)*(1-pi11)**(1-x1);
  prod12 = (pi12**x1)*(1-pi12)**(1-x1);
  prod21 = (pi21**x2)*(1-pi21)**(1-x2);
  prod22 = (pi22**x2)*(1-pi22)**(1-x2);
  prod31 = (pi31**x3)*(1-pi31)**(1-x3);
  prod32 = (pi32**x3)*(1-pi32)**(1-x3);

  /* model probability of each latent class */
  eta1 = exp(alpha1)/(1+exp(alpha1));
  eta2 = 1/(1+exp(alpha1));

  /* Construct likelihood and log likelihood */
  l_latclass = eta1*prod11*prod21*prod31 +
               eta2*prod12*prod22*prod32;
  ll_latclass = log(l_latclass);

  /* Maximize the latent class log likelihood. The left-hand    */
  /* variable is only a placeholder; x1 is used here because it */
  /* already exists in the data set.                            */
  model x1 ~ general(ll_latclass);
run;

Note that the initial parameter values in the code above
were specific to the problem presented in the Guo paper. It
is important that at least one of the parameters expressing
the probability of each variable X1, X2, and X3 given that
the vector of responses is from latent class 1 (bpi11,
bpi21, and bpi31) be different from the corresponding
parameter expressing the probability of a response variable
given that the vector of responses is from latent class 2
(bpi12, bpi22, and bpi32).
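
For items (1) and (2) in Ryan's original list, a sketch along the following lines may help. It repeats the two-class model and adds ESTIMATE statements for the item probabilities plus a PREDICT statement for the posterior probability of class 1, which by Bayes' rule is eta1*f1 / (eta1*f1 + eta2*f2). The names f1, f2, and post_class1 are my own choices and do not come from the Guo paper, and x1 on the left of the MODEL statement is only a placeholder.

```sas
proc nlmixed data=mydata;
   parms alpha1 = 0.5
         bpi11 = 1   bpi21 = 1   bpi31 = 0
         bpi12 = 0.5 bpi22 = 0.5 bpi32 = -0.5;

   /* P(Xi = 1 | class c) on the probability scale */
   pi11 = 1/(1+exp(-bpi11));  pi12 = 1/(1+exp(-bpi12));
   pi21 = 1/(1+exp(-bpi21));  pi22 = 1/(1+exp(-bpi22));
   pi31 = 1/(1+exp(-bpi31));  pi32 = 1/(1+exp(-bpi32));

   /* Probability of the observed response pattern within each class */
   f1 = (pi11**x1)*(1-pi11)**(1-x1)
      * (pi21**x2)*(1-pi21)**(1-x2)
      * (pi31**x3)*(1-pi31)**(1-x3);
   f2 = (pi12**x1)*(1-pi12)**(1-x1)
      * (pi22**x2)*(1-pi22)**(1-x2)
      * (pi32**x3)*(1-pi32)**(1-x3);

   /* Class probabilities and mixture log likelihood */
   eta1 = 1/(1+exp(-alpha1));
   eta2 = 1 - eta1;
   l_latclass  = eta1*f1 + eta2*f2;
   ll_latclass = log(l_latclass);
   model x1 ~ general(ll_latclass);     /* x1 is a placeholder only */

   /* (1) Item probabilities given class, with standard errors */
   estimate 'P(X1=1 | class 1)' 1/(1+exp(-bpi11));
   estimate 'P(X1=1 | class 2)' 1/(1+exp(-bpi12));

   /* (2) Posterior probability that each row belongs to class 1 */
   predict eta1*f1/l_latclass out=post_class1;
run;
```

The data set post_class1 (a hypothetical name) would hold one row per input row, with the posterior probability in the variable Pred. Further ESTIMATE statements for X2 and X3 follow the same pattern.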

If you are assuming two latent classes, then you might
initialize the parameters through the following process:

1) Compute the probability of each binary response
   p{i}, i=1, 2, 3

2) Compute the logit of each probability
   logit(p{i}) = log( p{i} / (1 - p{i}) )

3) Assign bpi{i}1 = logit(p{i}) + 0.25
          bpi{i}2 = logit(p{i}) - 0.25

If the correlation of X{1} with X{j} is negative, then
reverse the addition/subtraction operations in step 3
for j=2,3,...,J.
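
The three steps above can be sketched in SAS roughly as follows. This is an illustration under assumptions, not tested code from the thread: the data set names pbar and start are hypothetical, alpha1 = 0 is an arbitrary starting value, and the sign reversal for negatively correlated items is not handled. It also assumes each item has a mean strictly between 0 and 1, since the logit is undefined otherwise.

```sas
/* Step 1: observed proportion of 1s for each indicator */
proc means data=mydata noprint;
   var x1 x2 x3;
   output out=pbar mean=p1 p2 p3;
run;

/* Steps 2 and 3: logits, shifted up for class 1 and down for class 2 */
data start;
   set pbar;
   array p{3}  p1-p3;
   array c1{3} bpi11 bpi21 bpi31;   /* class 1 starting values */
   array c2{3} bpi12 bpi22 bpi32;   /* class 2 starting values */
   do i = 1 to 3;
      lgt   = log(p{i} / (1 - p{i}));
      c1{i} = lgt + 0.25;
      c2{i} = lgt - 0.25;
   end;
   alpha1 = 0;                      /* assumed: equal class sizes to start */
   keep alpha1 bpi11 bpi21 bpi31 bpi12 bpi22 bpi32;
run;
```

The resulting one-row data set could then replace the hard-coded starting values by writing `parms / data=start;` in the NLMIXED step.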

Now, you have random person effects. I presume that the
person effects would operate on the latent class
probability model, and not on the probabilities of each
observed response. So we could add random person effects
as follows:

/* Note: mydata must be sorted by person for the RANDOM statement */
proc nlmixed data=mydata tech=quanew lis=2 method=gauss
             maxiter=1000 gconv=.00000000001 fconv=.00000000001;
  parms
    /* Parameter which expresses probability of */
    /* latent class 1 vs latent class 2 */
    alpha1 = 0.5
    /* If there are more latent classes, then need */
    /* alpha1, alpha2, etc. Number of alpha parms */
    /* is one less than number of latent classes. */

    /* Log of the standard deviation of the random effect u1 */
    log_Valpha1 = -4

    /* Parameters which express probability of Xi=1 | latent class 1 */
    bpi11 = 1 bpi21 = 1 bpi31 = 0

    /* Parameters which express probability of Xi=1 | latent class 2 */
    bpi12 = 0.5 bpi22 = 0.5 bpi32 = -0.5

    /* Need bpi13, bpi23, etc. if three latent classes */
  ;

  /* Bound every logit-scale item parameter. The list names only the */
  /* six parameters of this three-item, two-class model.             */
  bounds -6 <= bpi11 bpi21 bpi31 bpi12 bpi22 bpi32 <= 6;

  /* latent class part */
  pi11 = 1/(1+exp(-bpi11)); pi12 = 1/(1+exp(-bpi12));
  pi21 = 1/(1+exp(-bpi21)); pi22 = 1/(1+exp(-bpi22));
  pi31 = 1/(1+exp(-bpi31)); pi32 = 1/(1+exp(-bpi32));
  prod11 = (pi11**x1)*(1-pi11)**(1-x1);
  prod12 = (pi12**x1)*(1-pi12)**(1-x1);
  prod21 = (pi21**x2)*(1-pi21)**(1-x2);
  prod22 = (pi22**x2)*(1-pi22)**(1-x2);
  prod31 = (pi31**x3)*(1-pi31)**(1-x3);
  prod32 = (pi32**x3)*(1-pi32)**(1-x3);

  /* model probability of each latent class. */
  /* Note that we add random effect u1 to alpha1. U1 is */
  /* person random effect on probability of latent class */
  /* 1. If there are more latent classes, then would */
  /* need terms u2, u3, etc. */
  eta1 = exp(alpha1 + u1)/(1+exp(alpha1 + u1));
  eta2 = 1/(1+exp(alpha1 + u1));

  /* Construct likelihood and log likelihood */
  l_latclass = eta1*prod11*prod21*prod31 +
               eta2*prod12*prod22*prod32;
  ll_latclass = log(l_latclass);

  /* Maximize the latent class log likelihood. The left-hand    */
  /* variable is only a placeholder; x1 is used here because it */
  /* already exists in the data set.                            */
  model x1 ~ general(ll_latclass);

  /* Model the variance of the latent class probability */
  random u1 ~ normal([0], [exp(2*log_Valpha1)]) subject=person;
run;

Let me know how this goes. It is an interesting problem.

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra(a)NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------

From: Dale McLerran on 2 Dec 2009 18:06

Very well put! Wish I could have put it so eloquently
and succinctly.

Dale

--- On Tue, 12/1/09, oloolo <dynamicpanel(a)YAHOO.COM> wrote:

> From: oloolo <dynamicpanel(a)YAHOO.COM>
> Subject: Re: Latent Class Analysis - Question
> To: SAS-L(a)LISTSERV.UGA.EDU
> Date: Tuesday, December 1, 2009, 11:32 AM
>
> When you specify general(loglikelihood) in the NLMIXED MODEL
> statement, the left-hand-side variable is irrelevant,
> because the true dependent variable is incorporated
> into the (log)likelihood function you construct using
> programming statements.
>
>
