From: Ryan on 4 Dec 2009 02:57

Hi Dale,

When I tried to run the model late last night, I realized immediately that the REs component was misspecified. Thanks for the correct specification (assuming uncorrelated random effects). I have a follow-up question, which I promise won't be more than a few lines of code. :)

When I think about the LCA, I think about wanting at least two outputs:

(1) probability that a (+) response on an item is associated with each latent class
(2) probability that each observation is associated with each latent class

I believe you solved problem (2) with the ESTIMATE statements in the previous thread, which was based on an example of 3 manifest dichotomous variables and 2 latent classes. I'd like to try to solve for at least one combination of responses to all items for my model (assuming 19 dichotomous manifest variables and 6 latent classes) to see if I'm doing this correctly.

Here goes nothing...

/*---------------------------------------------------------------*/
/* Calculate the probability that the observation belongs to     */
/* latent class 1, given a (-) response on all items except the  */
/* final 19th item                                               */
/*---------------------------------------------------------------*/
estimate "P(LC=1|x1=0,x2=0,x3=0,...,x19=1)"
  (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191)) /
  (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191) +
   eta2*(1-pi12)*(1-pi22)*(1-pi32)*...*(pi192) +
   eta3*(1-pi13)*(1-pi23)*(1-pi33)*...*(pi193) +
   eta4*(1-pi14)*(1-pi24)*(1-pi34)*...*(pi194) +
   eta5*(1-pi15)*(1-pi25)*(1-pi35)*...*(pi195) +
   eta6*(1-pi16)*(1-pi26)*(1-pi36)*...*(pi196));

Does that seem correct to you? If I'm even remotely correct, this could take me a long, long time to write all possible combinations! I assume a macro might work here. I'll have to look into using one--new territory for me.

Best,

Ryan

p.s. I plan on taking your advice and using underscores in the actual code.
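For readers wondering what such a macro might look like, here is a minimal sketch that builds one ESTIMATE statement for a single response pattern. The macro name lc_post, its parameters, and the parameter naming scheme (eta<c> for class prevalences, pi<j>_<c> for item j in class c, following the underscore advice) are assumptions made for illustration; they are not taken from the thread, and the sketch has not been run against Ryan's model.

```sas
/* Hypothetical helper: emits one ESTIMATE statement giving            */
/* P(LC=&class | x=&pattern) for a string of 0/1 item responses.       */
/* Assumes model parameters named eta<c> and pi<j>_<c> already exist.  */
%macro lc_post(pattern=, class=1, nclass=6);
  %local m j c x num den term;
  %let m = %length(&pattern);            /* number of manifest items   */
  %do c = 1 %to &nclass;
    %let term = eta&c;                   /* start with class prevalence */
    %do j = 1 %to &m;
      %let x = %substr(&pattern, &j, 1); /* response to item j (0 or 1) */
      %if &x = 1 %then %let term = &term*pi&j._&c;
      %else %let term = &term*(1-pi&j._&c);
    %end;
    %if &c = &class %then %let num = &term;   /* numerator: target class */
    %if &c = 1 %then %let den = &term;        /* denominator: all classes */
    %else %let den = &den + &term;
  %end;
  estimate "P(LC=&class|x=&pattern)" (&num) / (&den);
%mend lc_post;

/* Usage, placed inside the PROC NLMIXED step after the MODEL statement: */
/*   %lc_post(pattern=0000000000000000001, class=1, nclass=6)            */
```

With 19 items there are 2^19 = 524,288 possible patterns, so it is probably more practical to call a macro like this only for the response patterns that actually occur in the data, or for a handful of patterns of interest, than to enumerate all of them.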
From: Dale McLerran on 4 Dec 2009 02:01

Ryan,

I looked further, too, at your RANDOM statement. That is also not properly specified. You had:

random u1 u2 u3 u4 u5 ~ normal([0], [exp(2*log_Valpha1)],
                               [0], [exp(2*log_Valpha2)],
                               [0], [exp(2*log_Valpha3)],
                               [0], [exp(2*log_Valpha4)],
                               [0], [exp(2*log_Valpha5)] )
       subject=person;

What you were apparently trying to specify is that each random effect has mean zero and specified variance. Further, it appears that you want to assume that the random effects are uncorrelated. It is probably reasonable to assume that the latent class random effects are uncorrelated (although this is something that you might want to check through a likelihood ratio test).

You'll note that I stated what you were "apparently" trying to specify. The construction of your random statement is correct only through "normal(". After that, you must specify all of the expected values for u1 through u5 followed by a covariance matrix of the random effects. Actually, we only need to specify the lower triangular part of the covariance matrix, listed in row order inside a single pair of brackets. So, assuming that the random effects are independent, proper specification of your random statement would be

random u1 u2 u3 u4 u5 ~ normal([0,0,0,0,0],
                               [exp(2*log_Valpha1),
                                0, exp(2*log_Valpha2),
                                0, 0, exp(2*log_Valpha3),
                                0, 0, 0, exp(2*log_Valpha4),
                                0, 0, 0, 0, exp(2*log_Valpha5)] )
       subject=person;

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra(a)NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
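For reference, the covariance structure implied by the independence assumption can be written out explicitly. The symbols \Sigma and \sigma_k are introduced here only for illustration and do not appear in the thread; reading log_Valpha1 through log_Valpha5 as log standard deviations is an assumption based on the exp(2*...) form of the variances.

```latex
\Sigma \;=\; \operatorname{Cov}(u_1,\dots,u_5)
       \;=\; \operatorname{diag}\!\left(\sigma_1^2,\dots,\sigma_5^2\right),
\qquad
\sigma_k^2 \;=\; \exp\!\left(2\,\texttt{log\_Valpha}k\right)
           \;=\; \left[\exp\!\left(\texttt{log\_Valpha}k\right)\right]^2 \;>\; 0,
\quad k = 1,\dots,5 .
```

Under this reading, the exponential keeps every variance positive without bound constraints on the parameters. A symmetric 5 x 5 matrix has 5*6/2 = 15 distinct elements, so the RANDOM statement needs 15 covariance entries: the 5 variances plus 10 covariances, all of which are zero when the random effects are uncorrelated.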
From: Dale McLerran on 4 Dec 2009 13:37

--- On Thu, 12/3/09, Ryan <ryan.andrew.black(a)GMAIL.COM> wrote:

> From: Ryan <ryan.andrew.black(a)GMAIL.COM>
> Subject: Re: Latent Class Analysis via NLMIXED - UPDATE
> To: SAS-L(a)LISTSERV.UGA.EDU
> Date: Thursday, December 3, 2009, 11:57 PM
>
> Does that seem correct to you?

This seems to be a solution to problem #2 at the top of your post: Given a particular manifest variable combination, what is the probability of the observation belonging in latent class 1. For problem #2, your code is correct. However, that is, as you noted at the top, something which I had already addressed in a post yesterday.

My understanding of what you want as a solution to problem #1 is as follows: Suppose we only observed X19=1. What is the probability that an observation with X19=1 belongs to latent class 1? For the solution to that problem, see further down in this post.

But first, just a brief comment on the label part of your ESTIMATE statement. With an X vector of length 19, I would write the label part of the ESTIMATE statement (the quoted part) as:

estimate "P(LC=1|x=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1)"

You can clearly see which manifest variables are turned on or off. You do have to count across to determine which variable is being referenced. But I think it is preferable to naming the manifest variables and losing track of which variables are turned on or off in such a long string of variable names.

> If I'm even remotely correct, this could take me a long, long time to
> write all possible combinations! I assume a macro might work here.
> I'll have to look into using one--new territory for me.
>
> Best,
>
> Ryan
>
> p.s. I plan on taking your advice and using underscores in
> the actual code.

Here is my take on what you really want as a solution to problem 1. Let's take a step back to the two latent class model from three manifest variables.
The probabilities of each latent class and manifest variable combination are computed as follows:

                         Latent Class
Xvec      1                                  2
000       eta1*(1-pi11)*(1-pi21)*(1-pi31)    eta2*(1-pi12)*(1-pi22)*(1-pi32)
001       eta1*(1-pi11)*(1-pi21)*(pi31)      eta2*(1-pi12)*(1-pi22)*(pi32)
010       eta1*(1-pi11)*(pi21)*(1-pi31)      eta2*(1-pi12)*(pi22)*(1-pi32)
011       eta1*(1-pi11)*(pi21)*(pi31)        eta2*(1-pi12)*(pi22)*(pi32)
      ____________________________________________________________________
100   |   eta1*(pi11)*(1-pi21)*(1-pi31)      eta2*(pi12)*(1-pi22)*(1-pi32) |
101   |   eta1*(pi11)*(1-pi21)*(pi31)        eta2*(pi12)*(1-pi22)*(pi32)   |
110   |   eta1*(pi11)*(pi21)*(1-pi31)        eta2*(pi12)*(pi22)*(1-pi32)   |
111   |   eta1*(pi11)*(pi21)*(pi31)          eta2*(pi12)*(pi22)*(pi32)     |
      --------------------------------------------------------------------

So, if we are interested in the probability that LC=1 when X1=1, we are interested in the ratio of the sum of all LC=1 probabilities in the boxed area above to the sum of all probabilities in the boxed area. Thus, we would have

P(LC=1 | x1=1) =
  (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
   eta1*(pi11)*(pi21)*(1-pi31)   + eta1*(pi11)*(pi21)*(pi31) )

  /

  (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
   eta1*(pi11)*(pi21)*(1-pi31)   + eta1*(pi11)*(pi21)*(pi31) +

   eta2*(pi12)*(1-pi22)*(1-pi32) + eta2*(pi12)*(1-pi22)*(pi32) +
   eta2*(pi12)*(pi22)*(1-pi32)   + eta2*(pi12)*(pi22)*(pi32) )

Guess what? Your problem only got bigger! Since there are 2^m combinations of the m manifest variables for each latent class and you need half of those (2^(m-1)) in the numerator and C*(2^(m-1)) in the denominator, where C is the number of latent classes, you really need that macro code now to loop over all of the different probabilities. (You are probably ready now to shoot the messenger! But really, I am just trying to help!)

I would note that there is yet another item which you probably would like to have. It would be desirable to know the estimated probability for a particular manifest variable combination given the number of latent classes in your model. This would allow you to assess whether your latent class model is providing a satisfactory fit to the observed data. (For more on this, see John Uebersax's web page on latent class analysis.)

For our small problem with only two latent classes and three manifest variables, the probability of each manifest variable combination is obtained by summing across the rows of the table shown above.

Dale
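One observation that may shrink the problem again: every cell in the table is a class prevalence times a product of independent item terms, so the sum over the items that are not conditioned on collapses to 1. The derivation below is a sketch under that local-independence structure; the symbols \eta_c, \pi_{jc}, C, and m follow the thread's eta, pi, number of classes, and number of items, with \pi_{jc} read as the probability of a (+) response on item j in class c (an interpretation inferred from the table, not stated outright in the thread).

```latex
P(LC = c \mid x_j = 1)
  \;=\; \frac{\eta_c\,\pi_{jc}\,\prod_{k \neq j}\bigl[\pi_{kc} + (1 - \pi_{kc})\bigr]}
             {\sum_{c'=1}^{C} \eta_{c'}\,\pi_{jc'}\,\prod_{k \neq j}\bigl[\pi_{kc'} + (1 - \pi_{kc'})\bigr]}
  \;=\; \frac{\eta_c\,\pi_{jc}}{\sum_{c'=1}^{C} \eta_{c'}\,\pi_{jc'}} .

% Two classes, three items, conditioning on x_1 = 1:
P(LC = 1 \mid x_1 = 1) \;=\; \frac{\eta_1\,\pi_{11}}{\eta_1\,\pi_{11} + \eta_2\,\pi_{12}} .

% Marginal probability of one full response pattern (a row sum of the table):
P(x_1,\dots,x_m) \;=\; \sum_{c=1}^{C} \eta_c \prod_{j=1}^{m} \pi_{jc}^{\,x_j}\,(1 - \pi_{jc})^{1 - x_j} .
```

If this reduction holds for the model as fitted, the single-item quantity for the 19-item, 6-class case would need only the six eta parameters and the six pi parameters for item 19, which is worth checking numerically against the full sum for the small 3-item example before relying on it.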
From: Ryan on 4 Dec 2009 20:08

Dale,

By the time I write out all of the ESTIMATE statements in their full glory, say 20 years or so from present day, the statistical programmers at SAS will have enhanced the GLIMMIX procedure to handle a random effects LCA model with just a click of a button!

Seriously, it was very kind of you to answer all of my questions. You've given me much to consider, and as always, I've learned a tremendous amount. If I decide to go with the NLMIXED procedure for this model, which is possible now that you've provided me with all this info, I will write back with an update.

Take care,

Ryan