--- On Thu, 12/3/09, Ryan <ryan.andrew.black@GMAIL.COM> wrote:
> From: Ryan <ryan.andrew.black@GMAIL.COM>
> Subject: Re: Latent Class Analysis via NLMIXED - UPDATE
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Thursday, December 3, 2009, 11:57 PM
> Hi Dale,
>
> When I tried to run the model late last night, I realized immediately
> that the REs component was misspecified. Thanks for the correct
> (assuming independent correlations) specification. I have a follow-up
> question, which I promise won't be more than a few lines of code. :)
>
> When I think about the LCA, I think about wanting at least two
> outputs:
>
> (1) probability that a (+) response on an item is associated with
> each latent class
> (2) probability that each observation is associated with each
> latent class
>
> I believe you solved problem (2) with the Estimate statements in
> the previous thread, which was based on an example of 3 manifest
> dichotomous variables and 2 latent classes. I'd like to try to
> solve for at least one combination of responses to all items
> for my model (assuming 19 dichotomous manifest variables and 6
> latent classes) to see if I'm doing this correctly.
>
> Here goes nothing...
>
> /*---------------------------------------------------------------*/
> /* Calculate the probability that the observation belongs to */
> /* latent class 1, given a (-) response on all items except the */
> /* final 19th item */
> /*---------------------------------------------------------------*/
> estimate "P(LC=1|x1=0,x2=0,x3=0,...,x19=1)"
> (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191)) /
> (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191) +
> eta2*(1-pi12)*(1-pi22)*(1-pi32)*...*(pi192) +
> eta3*(1-pi13)*(1-pi23)*(1-pi33)*...*(pi193) +
> eta4*(1-pi14)*(1-pi24)*(1-pi34)*...*(pi194) +
> eta5*(1-pi15)*(1-pi25)*(1-pi35)*...*(pi195) +
> eta6*(1-pi16)*(1-pi26)*(1-pi36)*...*(pi196));
>
>
> Does that seem correct to you?
>
This is a solution to problem #2 at the top of your post: given
a particular manifest variable combination, what is the
probability that the observation belongs to latent class 1?
For problem #2, your code is correct. However, as you noted,
that is something I had already addressed in a post yesterday.

My understanding of what you want as a solution to problem
#1 is as follows: suppose we observed only X19=1. What is
the probability that an observation with X19=1 belongs to
latent class 1? For the solution to that problem, see further
down in this post.

But first, a brief comment on the label part of your
ESTIMATE statement. With an X vector of length 19, I would
write the label (the quoted part) as:

   estimate "P(LC=1|x=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1)"

You can clearly see which manifest variables are turned on
or off. You do have to count across to determine which
variable is being referenced, but I think that is preferable
to naming the manifest variables and losing track of which
variables are turned on or off in such a long string of
variable names.
> If I'm even remotely correct, this could take me a long, long time to
> write all possible combinations! I assume a macro might work here.
> I'll have to look into using one--new territory for me.
>
> Best,
>
> Ryan
>
> p.s. I plan on taking your advice and using underscores in
> the actual code.
>
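Since you mention a macro, here is a minimal sketch of one for
the problem #2 statements. It is untested against your model.
The macro name lc_post is made up for illustration, and the
parameter names eta<class> and pi<item>_<class> (underscore
form) are my assumption about how your NLMIXED code names
things. It writes one ESTIMATE statement for a response pattern
passed as a string of 0s and 1s.

```sas
/* Hypothetical helper: writes one ESTIMATE statement for        */
/* P(LC=&class | response pattern).                              */
/* Assumes parameters are named eta<c> and pi<item>_<c>.         */
%macro lc_post(pattern=, class=1, nclass=6);
  %local m j c term num den;
  %let m = %length(&pattern);        /* number of manifest items */
  %do c = 1 %to &nclass;
    %let term = eta&c;               /* class prevalence         */
    %do j = 1 %to &m;
      /* item j contributes pi if x_j=1, (1-pi) if x_j=0 */
      %if %substr(&pattern, &j, 1) = 1
        %then %let term = &term*pi&j._&c;
        %else %let term = &term*(1-pi&j._&c);
    %end;
    %if &c = &class %then %let num = &term;   /* numerator term  */
    %if &c = 1 %then %let den = &term;        /* start the sum   */
    %else %let den = &den + &term;            /* add class c     */
  %end;
  estimate "P(LC=&class|x=&pattern)" (&num) / (&den);
%mend lc_post;

/* Usage: call inside the PROC NLMIXED step, after the MODEL     */
/* statement. OPTIONS MPRINT shows the generated statement.      */
%lc_post(pattern=0000000000000000001, class=1, nclass=6)
```

The label comes out without spaces between the digits. Adding
them would take a little more macro code, which I have left out
to keep the sketch short.
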
Here is my take on what you really want as a solution to problem
1. Let's take a step back to the two latent class model from
three manifest variables. The probabilities of each latent
class and manifest variable combination are computed as follows:

                                 Latent Class
Xvec                  1                                 2
000    eta1*(1-pi11)*(1-pi21)*(1-pi31)   eta2*(1-pi12)*(1-pi22)*(1-pi32)
001    eta1*(1-pi11)*(1-pi21)*(pi31)     eta2*(1-pi12)*(1-pi22)*(pi32)
010    eta1*(1-pi11)*(pi21)*(1-pi31)     eta2*(1-pi12)*(pi22)*(1-pi32)
011    eta1*(1-pi11)*(pi21)*(pi31)       eta2*(1-pi12)*(pi22)*(pi32)
     _____________________________________________________________________
100  | eta1*(pi11)*(1-pi21)*(1-pi31)     eta2*(pi12)*(1-pi22)*(1-pi32)   |
101  | eta1*(pi11)*(1-pi21)*(pi31)       eta2*(pi12)*(1-pi22)*(pi32)     |
110  | eta1*(pi11)*(pi21)*(1-pi31)       eta2*(pi12)*(pi22)*(1-pi32)     |
111  | eta1*(pi11)*(pi21)*(pi31)         eta2*(pi12)*(pi22)*(pi32)       |
     ---------------------------------------------------------------------

So, if we are interested in the probability that LC=1 when
X1=1, we want the ratio of the sum of all LC=1 probabilities
in the boxed area above to the sum of all probabilities in
the boxed area. Thus, we would have

   P(LC=1 | x1=1) =

     (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
      eta1*(pi11)*(pi21)*(1-pi31)   + eta1*(pi11)*(pi21)*(pi31))
     /
     (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
      eta1*(pi11)*(pi21)*(1-pi31)   + eta1*(pi11)*(pi21)*(pi31)   +
      eta2*(pi12)*(1-pi22)*(1-pi32) + eta2*(pi12)*(1-pi22)*(pi32) +
      eta2*(pi12)*(pi22)*(1-pi32)   + eta2*(pi12)*(pi22)*(pi32))

Guess what? Your problem only got bigger! With m manifest
variables there are 2^m response patterns for each latent
class. Written out term by term, you need half of those
(2^(m-1)) in the numerator and, with C latent classes,
C*(2^(m-1)) in the denominator, so you really need that macro
code now to loop over all of the different probabilities.
(You are probably ready now to shoot the messenger! But
really, I am just trying to help!)
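
One thing that may shrink the problem again, offered as a
sketch rather than a tested result. The table treats the items
as independent within a class, so within each class the terms
for the items that are not conditioned on sum to one:
(1-pi21)+pi21 = 1, and likewise for item 3. If that holds for
your parameterization, the boxed-area ratio collapses to one
term per latent class. The first statement below is the small
model. The second uses the pi191 ... pi196 names from your
post for the 19-item, 6-class case.

```sas
/* Small model: P(LC=1 | x1=1) after summing out items 2 and 3 */
estimate "P(LC=1|x1=1)"
  (eta1*pi11) / (eta1*pi11 + eta2*pi12);

/* 19-item, 6-class model: P(LC=1 | x19=1) */
estimate "P(LC=1|x19=1)"
  (eta1*pi191) /
  (eta1*pi191 + eta2*pi192 + eta3*pi193 +
   eta4*pi194 + eta5*pi195 + eta6*pi196);
```

It would be worth checking the short form against the
long-hand ratio on the small model before relying on it.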

I would note that there is yet another item which you probably
would like to have: the estimated probability of a particular
manifest variable combination, given the number of latent
classes in your model. Comparing those estimated probabilities
with the observed pattern frequencies lets you assess whether
your latent class model provides a satisfactory fit to the
observed data. (For more on this, see John Uebersax's web page
on latent class analysis.)

For our small problem with only two latent classes and three
manifest variables, the probability of each manifest variable
combination is obtained by summing across the rows of the
table shown above.
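
As a sketch of that row sum for the small model, here is the
statement for the all-zero pattern. The parameter names are
those used in the table, and the label text is just my choice:

```sas
/* P(x = 0 0 0): first table row summed over both latent classes */
estimate "P(x=0 0 0)"
  eta1*(1-pi11)*(1-pi21)*(1-pi31) +
  eta2*(1-pi12)*(1-pi22)*(1-pi32);
```

Multiplying each such probability by the sample size gives an
expected count to set against the observed count for that
pattern.
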
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------