Latent Class Analysis - Question [SAS]


From: oloolo on 4 Jan 2010 10:38

Ryan,
If you want to incorporate random effects into your posterior estimation,
use a DATA step to get subject-specific estimates, and then apply the
formula from the ESTIMATE statements of Dale's code in that DATA step.

On Sat, 5 Dec 2009 11:14:45 -0800, Ryan <ryan.andrew.black(a)GMAIL.COM> wrote:
>
>Hey Dale,
>
>I decided to run the fixed effects LCA model through the nlmixed
>procedure using the code you presented in this thread. I then ran what
>I think is the same model on the same data in the demo version of the
>software program, "Latent Gold." Note that the data I used was
>obtained from an example data set provided in the demo version of
>Latent Gold. Anyway, the bottom line is that Latent Gold yielded
>pretty similar results to the results from the code you developed
>using the nlmixed procedure. I figured this might be of interest to you.
>
>I also wanted to compare results from the same model with the
>inclusion of the random effects, but when I tried to run the random
>effects model including the ESTIMATE statements in the nlmixed
>procedure, I received an error that the "ESTIMATE statement
>expressions are not allowed to be dependent on the random effects." As
>a result, I was not able to validate results from the random effects
>model.
>
>For those interested, the demo version of the Latent Gold program can
>be downloaded here:
>
>http://www.statisticalinnovations.com/products/latentgold_v4.html
>
>Best,
>
>Ryan

From: Dale McLerran on 4 Jan 2010 13:21

Not specified in oloolo's response is how the random effects
which are to be employed in the data step are determined. The
raison d'etre of the NLMIXED procedure is to enable computation
of the random effects. In order to specify appropriate random
effects in the data step, the random effects must be obtained
from NLMIXED.

The way to obtain the random effect estimates from the NLMIXED
procedure is through a PREDICT statement. Note that the PREDICT
statement operates much like the ESTIMATE statement. However,
the ESTIMATE statement is restricted to functions of model
fixed effects only whereas the PREDICT statement allows
functions which include random effects. A further difference
between estimates generated by an ESTIMATE statement and
estimates generated employing the PREDICT statement is that
PREDICT statement estimates must be written to a data set.
Estimates generated through the PREDICT statement cannot be
displayed during execution of the NLMIXED procedure.
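
To make the ESTIMATE/PREDICT distinction concrete, here is a minimal
sketch. It is not the latent class code from earlier in this thread. It
assumes a hypothetical data set ITEMS with a subject identifier ID and a
binary response Y, and fits a simple random-intercept logistic model. The
parameter names (b0, s2u) and the output data set name RE_EST are
likewise illustrative.

```sas
/* Sketch only: ITEMS, ID, Y, b0, s2u and RE_EST are hypothetical names. */
proc sort data=items;
  by id;                      /* keep each subject's rows contiguous */
run;

proc nlmixed data=items;
  parms b0=0 s2u=1;
  eta = b0 + u;               /* u is the subject-level random intercept */
  p   = 1 / (1 + exp(-eta));
  model y ~ binary(p);
  random u ~ normal(0, s2u) subject=id;

  /* ESTIMATE: a function of fixed-effect parameters only */
  estimate 'P(y=1) at u=0' 1 / (1 + exp(-b0));

  /* PREDICT: may involve the random effect; results go to OUT= */
  predict u out=re_est;
run;

proc print data=re_est(obs=10);
run;
```

RE_EST should hold the input variables plus, for each observation, the
predicted value (Pred) and its approximate standard error (StdErrPred).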

Now, one could employ a PREDICT statement to write out just
the random effect estimates. The approach specified by oloolo
could then be used to obtain posterior estimates which
incorporate the random effects. But this approach would not
allow one to construct the variance of the function which
is generated. If the variance of the estimate is needed (as
it usually is), then it is preferable to obtain the results
employing multiple predict statements.
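
The two routes might be sketched as follows, using a hypothetical
random-intercept logistic model as a stand-in (data set ITEMS, variables
ID and Y, parameters b0 and s2u are all assumptions, not the latent class
model under discussion). Route 1 writes out only the random effect
estimates and evaluates the function in a DATA step, which gives a point
value with no standard error. Route 2 lets NLMIXED evaluate the function
in a second PREDICT statement, so the standard error comes with it.

```sas
/* Sketch only: all data set, variable and parameter names are hypothetical. */
ods output ParameterEstimates=pe;   /* fixed-effect estimates for Route 1 */

proc nlmixed data=items;
  parms b0=0 s2u=1;
  eta = b0 + u;
  p   = 1 / (1 + exp(-eta));
  model y ~ binary(p);
  random u ~ normal(0, s2u) subject=id;

  predict u out=re_est;   /* Route 1 input: random effect estimates only */
  predict p out=p_est;    /* Route 2: the function itself, with StdErrPred */
run;

/* Route 1: plug the estimates into the formula by hand (no variance) */
data p_by_hand;
  if _n_ = 1 then
    set pe(keep=Parameter Estimate where=(Parameter = 'b0'));
  set re_est;
  p_hand = 1 / (1 + exp(-(Estimate + Pred)));   /* Estimate = b0-hat, Pred = u-hat */
run;
```

P_HAND in the DATA step and Pred in P_EST should agree, but only P_EST
also carries StdErrPred and the confidence limits.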

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra(a)NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------


From: oloolo on 5 Jan 2010 07:18

nice input, as always

However, I noticed one potential problem in evaluating the joint
probability of, say, subject i being in state j. Since SAS has no
multivariate normal density function in Base SAS or SAS/STAT, we have to
compute it manually, i.e., via a hand-coded Cholesky decomposition of the
estimated variance-covariance matrix for subject i.
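
For reference, a manual evaluation of that kind might look like the
DATA step sketch below. It assumes the lower triangular Cholesky factor L
of the covariance matrix is already available. The values of x, mu and L
are made up for illustration. With L*z = x - mu solved by forward
substitution, the log density is
-0.5*(k*log(2*pi) + 2*sum(log(Lii)) + z'z).

```sas
/* Sketch only: x, mu and L hold made-up values for a 3-dimensional case. */
data mvn_logpdf;
  array x[3]   _temporary_ (0.5 -0.2 1.0);
  array mu[3]  _temporary_ (0 0 0);
  array L[3,3] _temporary_ (1.0 0   0
                            0.3 0.9 0
                            0.2 0.1 0.8);   /* filled row by row */
  array z[3]   _temporary_;

  logdet = 0;   /* log determinant of Cov = 2 * sum of log(L[i,i]) */
  quad   = 0;   /* quadratic form (x-mu)' * inv(Cov) * (x-mu) = z'z */

  do i = 1 to 3;
    s = x[i] - mu[i];
    do j = 1 to i - 1;            /* forward substitution for L*z = x - mu */
      s = s - L[i,j] * z[j];
    end;
    z[i]   = s / L[i,i];
    quad   = quad + z[i]**2;
    logdet = logdet + 2 * log(L[i,i]);
  end;

  logpdf = -0.5 * (3 * log(2 * constant('pi')) + logdet + quad);
  put logpdf=;
  keep logpdf;
run;
```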

Given this complexity, I will definitely come up with a macro to loop over
the iterations, alternating between two steps: optimizing over the weighting
probability parameters, and over the other parameters in the mean profile
and variance-covariance matrix, so that it works more like an EM algorithm.
After each M-step, a Cholesky decomposition will be computed in a DATA step,
and the joint probabilities will be evaluated one by one from the Cholesky
output.

any thoughts?


From: Dale McLerran on 5 Jan 2010 12:50

Well, I suppose it could be done. However, it does not seem
to me that the SAS data step is the appropriate tool for what
is proposed. The IML procedure would be better suited for the
proposed processing.

All the same, I would prefer to use the NLMIXED procedure. It
is possible to declare a vector of random effects which are
assumed to be normally distributed and have a non-zero covariance
structure. For most problems, there should not be any difficulty
evaluating the probability for subject i in latent class j. If
the number of latent classes (with a subject-specific random
effect for each latent class) becomes too large (on the order
of a dozen or so random effects), then the NLMIXED procedure
can run into problems.

Parameterizing the covariance in terms of a Cholesky decomp
is a very good idea. This can be done within NLMIXED.
Specifically, since the Cholesky factor L of a positive
semidefinite matrix is a lower triangular matrix with
L*L' = Cov, a covariance matrix of dimension 3 can be
expressed for use in NLMIXED code as:

      | L11   0    0  |   | L11  L21  L31 |
Cov = | L21  L22   0  | * |  0   L22  L32 |
      | L31  L32  L33 |   |  0    0   L33 |

      | L11^2     L21*L11             L31*L11               |
    = | L21*L11   L21^2 + L22^2       L31*L21 + L32*L22     |
      | L31*L11   L31*L21 + L32*L22   L31^2 + L32^2 + L33^2 |

Employing Lij (i>=j) as parameters, we can express the
covariance matrix as shown above. By parameterizing the
covariance matrix in terms of a Cholesky decomposition of
the covariance matrix, we are guaranteed that the random
effects covariance matrix will have an estimate which is
positive semidefinite. Thus, everything that you intend
to do with data step code including the Cholesky decomp of
the random effects covariance matrix can be done employing
NLMIXED code.
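
A minimal sketch of such a parameterization is given below. The model
around it is only a stand-in: a hypothetical quadratic growth model with
three correlated random effects (data set GROWTH, variables ID, T and Y),
not the latent class model from this thread. The part that carries over
is the set of programming statements that build the six covariance
elements from the Lij parameters, and the RANDOM statement that receives
them in lower triangular order.

```sas
/* Sketch only: GROWTH, ID, T, Y and all parameter names are hypothetical. */
proc nlmixed data=growth;
  parms b0=0 b1=0 b2=0 s2e=1
        L11=1 L21=0 L22=1 L31=0 L32=0 L33=1;
  bounds s2e > 0;

  /* Covariance elements built from the Cholesky factor: Cov = L*L' */
  v11 = L11**2;
  v21 = L21*L11;
  v22 = L21**2 + L22**2;
  v31 = L31*L11;
  v32 = L31*L21 + L32*L22;
  v33 = L31**2 + L32**2 + L33**2;

  /* Stand-in model with three subject-level random effects */
  mu = (b0 + u1) + (b1 + u2)*t + (b2 + u3)*t*t;
  model y ~ normal(mu, s2e);

  /* Covariance matrix is given as its lower triangle, row by row */
  random u1 u2 u3 ~ normal([0, 0, 0],
                           [v11,
                            v21, v22,
                            v31, v32, v33]) subject=id;

  /* Variances and covariances depend on fixed parameters only,
     so ESTIMATE can report them */
  estimate 'Var(u1)'    L11**2;
  estimate 'Cov(u2,u1)' L21*L11;
  estimate 'Var(u3)'    L31**2 + L32**2 + L33**2;
run;
```

Because the optimizer works on the unconstrained Lij, every iterate of
the covariance matrix is positive semidefinite by construction, without
any BOUNDS on the covariance elements themselves.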

Dale


