LISTSERV at the University of Georgia
Date:   Fri, 4 Dec 2009 10:37:45 -0800
Reply-To:   Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:   Re: Latent Class Analysis via NLMIXED - UPDATE
In-Reply-To:   <238ea1f6-94d6-4f73-9e31-cc21ebfc26c1@a21g2000yqc.googlegroups.com>
Content-Type:   text/plain; charset=iso-8859-1

--- On Thu, 12/3/09, Ryan <ryan.andrew.black@GMAIL.COM> wrote:

> From: Ryan <ryan.andrew.black@GMAIL.COM>
> Subject: Re: Latent Class Analysis via NLMIXED - UPDATE
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Thursday, December 3, 2009, 11:57 PM
> Hi Dale,
>
> When I tried to run the model late last night, I realized immediately
> that the REs component was misspecified. Thanks for the correct
> (assuming independent correlations) specification. I have a follow-up
> question, which I promise won't be more than a few lines of code. :)
>
> When I think about the LCA, I think about wanting at least two
> outputs:
>
> (1) probability that a (+) response on an item is associated with
>     each latent class
> (2) probability that each observation is associated with each
>     latent class
>
> I believe you solved problem (2) with the Estimate statements in
> the previous thread, which was based on an example of 3 manifest
> dichotomous variables and 2 latent classes. I'd like to try to
> solve for at least one combination of responses to all items
> for my model (assuming 19 dichotomous manifest variables and 6
> latent classes) to see if I'm doing this correctly.
>
> Here goes nothing...
>
> /*---------------------------------------------------------------*/
> /* Calculate the probability that the observation belongs to     */
> /* latent class 1, given a (-) response on all items except the  */
> /* final 19th item                                               */
> /*---------------------------------------------------------------*/
> estimate "P(LC=1|x1=0,x2=0,x3=0,...,x19=1)"
>    (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191)) /
>    (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191) +
>     eta2*(1-pi12)*(1-pi22)*(1-pi32)*...*(pi192) +
>     eta3*(1-pi13)*(1-pi23)*(1-pi33)*...*(pi193) +
>     eta4*(1-pi14)*(1-pi24)*(1-pi34)*...*(pi194) +
>     eta5*(1-pi15)*(1-pi25)*(1-pi35)*...*(pi195) +
>     eta6*(1-pi16)*(1-pi26)*(1-pi36)*...*(pi196));
>
>
> Does that seem correct to you?
>

This looks like a solution to problem #2 at the top of your post: given a particular manifest variable combination, what is the probability that the observation belongs to latent class 1? For problem #2, your code is correct. However, as you noted at the top, that is something I had already addressed in a post yesterday.
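For anyone who wants to check the arithmetic outside of SAS, here is a minimal Python sketch of the posterior calculation that such an ESTIMATE statement encodes. The function name, the layout of the parameters (eta as a list, pi indexed by item and then by class), and the numeric values are all made up for illustration; they are not estimates from any model discussed in this thread.

```python
from math import prod

def posterior_class_probs(eta, pi, x):
    """Return P(LC=c | x) for every class c.

    eta[c]   : assumed prevalence of latent class c
    pi[j][c] : assumed P(item j = 1 | class c)
    x        : list of 0/1 responses, one per item
    """
    # Joint probability of the pattern x and each class, assuming the
    # items are independent within a class.
    joint = [
        eta[c] * prod(pi[j][c] if x[j] == 1 else 1.0 - pi[j][c]
                      for j in range(len(x)))
        for c in range(len(eta))
    ]
    total = sum(joint)
    return [p / total for p in joint]

# Hypothetical values: 3 items, 2 latent classes.
eta = [0.6, 0.4]
pi = [[0.8, 0.2],
      [0.7, 0.3],
      [0.9, 0.1]]

post = posterior_class_probs(eta, pi, [0, 0, 1])
print(post)
```

The first element of the result plays the role of P(LC=1|x=0 0 1); the same ratio of products is what the ESTIMATE expression spells out term by term.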

My understanding of what you want as a solution to problem #1 is as follows: Suppose we only observed X19=1. What is the probability that an observation with X19=1 belongs to latent class 1? For the solution to that problem, see further down in this post.

But first, just a brief comment on the label part of your ESTIMATE statement. With an X vector of length 19, I would write the label part of the ESTIMATE statement (the quoted part) as:

estimate "P(LC=1|x=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1)"

You can clearly see which manifest variables are turned on or off. You do have to count across to determine which variable is being referenced, but I think that is preferable to naming the manifest variables and losing track of which variables are turned on or off in such a long string of variable names.

> If I'm even remotely correct, this could take me a long, long time to
> write all possible combinations! I assume a macro might work here.
> I'll have to look into using one--new territory for me.
>
> Best,
>
> Ryan
>
> p.s. I plan on taking your advice and using underscores in
> the actual code.
>
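A SAS macro is one way to generate these statements. As a rough alternative, the statement text could also be written by a small script and pasted (or %INCLUDEd) into the NLMIXED step. The Python sketch below is only an illustration of that idea: it assumes parameters named eta<c> and pi<j>_<c> (item j, class c, with the underscore separator), which is a hypothetical naming scheme and may not match the names in the actual model.

```python
def estimate_statement(x, n_classes, target=1):
    """Build the text of an ESTIMATE statement for P(LC=target | x).

    Assumes (hypothetically) that the model parameters are named
    eta<c> for class prevalences and pi<j>_<c> for P(item j = 1 | class c).
    """
    def term(c):
        # eta_c times one factor per item: pi if x_j = 1, (1 - pi) otherwise.
        factors = [f"eta{c}"]
        for j, xj in enumerate(x, start=1):
            factors.append(f"pi{j}_{c}" if xj == 1 else f"(1-pi{j}_{c})")
        return "*".join(factors)

    label = f"P(LC={target}|x={' '.join(str(v) for v in x)})"
    denom = " +\n     ".join(term(c) for c in range(1, n_classes + 1))
    return f'estimate "{label}"\n    ({term(target)}) /\n    ({denom});'

# All items (-) except the 19th, with 6 latent classes.
stmt = estimate_statement([0] * 18 + [1], n_classes=6)
print(stmt)
```

Looping over response patterns (and over target classes) would produce one statement per combination of interest; the generated text should be checked against the parameter names in the PARMS statement before use.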

Here is my take on what you really want as a solution to problem 1. Let's take a step back to the two latent class model from three manifest variables. The probabilities of each latent class and manifest variable combination are computed as follows:

                         Latent Class
Xvec  1                                 2
000   eta1*(1-pi11)*(1-pi21)*(1-pi31)   eta2*(1-pi12)*(1-pi22)*(1-pi32)
001   eta1*(1-pi11)*(1-pi21)*(pi31)     eta2*(1-pi12)*(1-pi22)*(pi32)
010   eta1*(1-pi11)*(pi21)*(1-pi31)     eta2*(1-pi12)*(pi22)*(1-pi32)
011   eta1*(1-pi11)*(pi21)*(pi31)       eta2*(1-pi12)*(pi22)*(pi32)
    _____________________________________________________________________
100 | eta1*(pi11)*(1-pi21)*(1-pi31)     eta2*(pi12)*(1-pi22)*(1-pi32)   |
101 | eta1*(pi11)*(1-pi21)*(pi31)       eta2*(pi12)*(1-pi22)*(pi32)     |
110 | eta1*(pi11)*(pi21)*(1-pi31)       eta2*(pi12)*(pi22)*(1-pi32)     |
111 | eta1*(pi11)*(pi21)*(pi31)         eta2*(pi12)*(pi22)*(pi32)       |
    ---------------------------------------------------------------------

So, if we are interested in the probability that LC=1 when X1=1, we are interested in the ratio of the sum of all LC=1 probabilities in the boxed area above to the sum of all probabilities in the boxed area. Thus, we would have

P(LC=1 | x1=1) =

    ( eta1*(pi11)*(1-pi21)*(1-pi31) +
      eta1*(pi11)*(1-pi21)*(pi31)   +
      eta1*(pi11)*(pi21)*(1-pi31)   +
      eta1*(pi11)*(pi21)*(pi31) )

    /

    ( eta1*(pi11)*(1-pi21)*(1-pi31) +
      eta1*(pi11)*(1-pi21)*(pi31)   +
      eta1*(pi11)*(pi21)*(1-pi31)   +
      eta1*(pi11)*(pi21)*(pi31)     +

      eta2*(pi12)*(1-pi22)*(1-pi32) +
      eta2*(pi12)*(1-pi22)*(pi32)   +
      eta2*(pi12)*(pi22)*(1-pi32)   +
      eta2*(pi12)*(pi22)*(pi32) )
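The sum-over-patterns calculation for P(LC=1 | x1=1) can be checked numerically with a short enumeration. This is a minimal Python sketch under the same made-up parameter values used for illustration only (two classes, three items); the function names and the data layout are assumptions, not anything taken from the NLMIXED code.

```python
from itertools import product
from math import prod

def joint_prob(eta, pi, c, x):
    """P(LC=c and X=x), assuming items are independent within a class."""
    return eta[c] * prod(pi[j][c] if xj else 1.0 - pi[j][c]
                         for j, xj in enumerate(x))

def class_prob_given_item(eta, pi, c, item, value=1):
    """P(LC=c | x_item = value), by summing over all patterns of the other items."""
    m = len(pi)
    # Keep only the response patterns in which the chosen item has the given value.
    patterns = [x for x in product((0, 1), repeat=m) if x[item] == value]
    num = sum(joint_prob(eta, pi, c, x) for x in patterns)
    den = sum(joint_prob(eta, pi, k, x)
              for k in range(len(eta)) for x in patterns)
    return num / den

# Hypothetical values: eta[c] and pi[j][c] = P(item j = 1 | class c).
eta = [0.6, 0.4]
pi = [[0.8, 0.2],
      [0.7, 0.3],
      [0.9, 0.1]]

p = class_prob_given_item(eta, pi, c=0, item=0)
print(p)
```

With these illustrative values the enumerated result agrees with eta1*pi11 / (eta1*pi11 + eta2*pi12), which is what one would expect when the factors for the other items sum to one within each class. It may be worth confirming that on the fitted model before writing out every term.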

Guess what? Your problem only got bigger! With m manifest variables there are 2^m response combinations for each latent class. You need half of those (2^(m-1)) in the numerator and C*(2^(m-1)) in the denominator, where C is the number of latent classes. For your model with m=19 and C=6, that is 262,144 terms in the numerator and 1,572,864 in the denominator, so you really need that macro code now to loop over all of the different probabilities. (You are probably ready now to shoot the messenger! But really, I am just trying to help!)

I would note that there is yet another item which you probably would like to have. It would be desirable to know the estimated probability for a particular manifest variable combination given the number of latent classes in your model. This would allow you to assess whether your latent class model is providing a satisfactory fit to the observed data. (For more on this, see John Uebersax's web page on latent class analysis.)

For our small problem with only two latent classes and three manifest variables, the probability of each manifest variable combination is obtained by summing across each row of the table shown above, that is, over the latent classes.
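As a rough illustration of that row sum, the Python sketch below computes the model-implied probability of every response pattern for a small two-class, three-item example. The parameter values, the function name, and the data layout are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import product
from math import prod

def pattern_probs(eta, pi):
    """Return {pattern: P(X = pattern)}, summing each table row over classes."""
    m = len(pi)
    probs = {}
    for x in product((0, 1), repeat=m):
        probs[x] = sum(
            eta[c] * prod(pi[j][c] if xj else 1.0 - pi[j][c]
                          for j, xj in enumerate(x))
            for c in range(len(eta))
        )
    return probs

# Hypothetical values: eta[c] and pi[j][c] = P(item j = 1 | class c).
eta = [0.6, 0.4]
pi = [[0.8, 0.2],
      [0.7, 0.3],
      [0.9, 0.1]]

probs = pattern_probs(eta, pi)
for x, p in probs.items():
    print("".join(map(str, x)), round(p, 4))
```

Multiplying each probability by the sample size gives an expected count for that pattern, which can then be set beside the observed count when judging fit.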

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------

