LISTSERV at the University of Georgia
Date:   Fri, 4 Dec 2009 17:08:11 -0800
Reply-To:   Ryan <ryan.andrew.black@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Ryan <ryan.andrew.black@GMAIL.COM>
Organization:   http://groups.google.com
Subject:   Re: Latent Class Analysis via NLMIXED - UPDATE
Comments:   To: sas-l@uga.edu
Content-Type:   text/plain; charset=ISO-8859-1

On Dec 4, 1:37 pm, stringplaye...@YAHOO.COM (Dale McLerran) wrote:
> --- On Thu, 12/3/09, Ryan <ryan.andrew.bl...@GMAIL.COM> wrote:
>
> > From: Ryan <ryan.andrew.bl...@GMAIL.COM>
> > Subject: Re: Latent Class Analysis via NLMIXED - UPDATE
> > To: SA...@LISTSERV.UGA.EDU
> > Date: Thursday, December 3, 2009, 11:57 PM
>
> > Hi Dale,
>
> > When I tried to run the model late last night, I realized immediately
> > that the REs component was misspecified. Thanks for the correct
> > (assuming independent correlations) specification. I have a follow-up
> > question, which I promise won't be more than a few lines of code. :)
>
> > When I think about the LCA, I think about wanting at least two
> > outputs:
>
> > (1) probability that a (+) response on an item is associated with
> > each latent class
> > (2) probability that each observation is associated with each
> > latent class
>
> > I believe you solved problem (2) with the Estimate statements in
> > the previous thread, which was based on an example of 3 manifest
> > dichotomous variables and 2 latent classes. I'd like to try to
> > solve for at least one combination of responses to all items
> > for my model (assuming 19 dichotomous manifest variables and 6
> > latent classes) to see if I'm doing this correctly.
>
> > Here goes nothing...
>
> > /*---------------------------------------------------------------*/
> > /* Calculate the probability that the observation belongs to     */
> > /* latent class 1, given a (-) response on all items except the  */
> > /* final 19th item                                               */
> > /*---------------------------------------------------------------*/
> > estimate "P(LC=1|x1=0,x2=0,x3=0,...,x19=1)"
> > (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191)) /
> > (eta1*(1-pi11)*(1-pi21)*(1-pi31)*...*(pi191) +
> > eta2*(1-pi12)*(1-pi22)*(1-pi32)*...*(pi192) +
> > eta3*(1-pi13)*(1-pi23)*(1-pi33)*...*(pi193) +
> > eta4*(1-pi14)*(1-pi24)*(1-pi34)*...*(pi194) +
> > eta5*(1-pi15)*(1-pi25)*(1-pi35)*...*(pi195) +
> > eta6*(1-pi16)*(1-pi26)*(1-pi36)*...*(pi196));
>
> > Does that seem correct to you?
>
> This seems to be a solution to problem #2 at the top of your
> post: Given a particular manifest variable combination, what
> is the probability of the observation belonging in latent
> class 1. For problem #2, your code is correct. However, that
> is, as you noted at the top, something which I had already
> addressed in a post yesterday.
>
> My understanding of what you want as a solution to problem
> #1 is as follows: Suppose we only observed X19=1. What is
> the probability that an observation with X19=1 belongs to
> latent class 1? For the solution to that problem, see further
> down in this post.
>
> But first, just a brief comment on the label part of your
> ESTIMATE statement. With an X vector of length 19, I would
> write the label part of the ESTIMATE statement (the quoted
> part) as:
>
> estimate "P(LC=1|x=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1)"
>
> You can clearly see which manifest variables are turned on
> or off. You do have to count across to determine which
> variable is being referenced. But I think it is preferable
> to naming the manifest variables and losing track of
> which variables are turned on or off in such a long string
> of variable names.
>
> > If I'm even remotely correct, this could take me a long, long time to
> > write all possible combinations! I assume a macro might work here.
> > I'll have to look into using one--new territory for me.
>
> > Best,
>
> > Ryan
>
> > p.s. I plan on taking your advice and using underscores in
> > the actual code.
>
> Here is my take on what you really want as a solution to problem
> 1. Let's take a step back to the two latent class model from
> three manifest variables. The probabilities of each latent
> class and manifest variable combination are computed as follows:
>
>                        Latent Class
> Xvec     1                                 2
> 000      eta1*(1-pi11)*(1-pi21)*(1-pi31)   eta2*(1-pi12)*(1-pi22)*(1-pi32)
> 001      eta1*(1-pi11)*(1-pi21)*(pi31)     eta2*(1-pi12)*(1-pi22)*(pi32)
> 010      eta1*(1-pi11)*(pi21)*(1-pi31)     eta2*(1-pi12)*(pi22)*(1-pi32)
> 011      eta1*(1-pi11)*(pi21)*(pi31)       eta2*(1-pi12)*(pi22)*(pi32)
>      ____________________________________________________________________
> 100  |   eta1*(pi11)*(1-pi21)*(1-pi31)     eta2*(pi12)*(1-pi22)*(1-pi32) |
> 101  |   eta1*(pi11)*(1-pi21)*(pi31)       eta2*(pi12)*(1-pi22)*(pi32)   |
> 110  |   eta1*(pi11)*(pi21)*(1-pi31)       eta2*(pi12)*(pi22)*(1-pi32)   |
> 111  |   eta1*(pi11)*(pi21)*(pi31)         eta2*(pi12)*(pi22)*(pi32)     |
>      --------------------------------------------------------------------
>
> So, if we are interested in the probability that LC=1 when
> X1=1, we are interested in the ratio of the sum of all LC=1
> probabilities in the boxed area above to the sum of all
> probabilities in the boxed area. Thus, we would have
>
> P(LC=1 | x1=1) =
> (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
> eta1*(pi11)*(pi21)*(1-pi31) + eta1*(pi11)*(pi21)*(pi31) )
>
> /
>
> (eta1*(pi11)*(1-pi21)*(1-pi31) + eta1*(pi11)*(1-pi21)*(pi31) +
> eta1*(pi11)*(pi21)*(1-pi31) + eta1*(pi11)*(pi21)*(pi31) +
>
> eta2*(pi12)*(1-pi22)*(1-pi32) + eta2*(pi12)*(1-pi22)*(pi32) +
> eta2*(pi12)*(pi22)*(1-pi32) + eta2*(pi12)*(pi22)*(pi32) )
>
> Guess what? Your problem only got bigger!
> Since there are 2^m
> combinations of manifest variables for each latent class and
> you need half of those (2^(m-1)) in the numerator and C*(2^(m-1))
> in the denominator, you really need that macro code now to loop
> over all of the different probabilities. (You are probably
> ready now to shoot the messenger! But really, I am just trying
> to help!)
>
> I would note that there is yet another item which you probably
> would like to have. It would be desirable to know the estimated
> probability for a particular manifest variable combination given
> the number of latent classes in your model. This would allow
> you to assess whether your latent class model is providing a
> satisfactory fit to the observed data. (For more on this, see
> John Uebersax's web page on latent class analysis.)
>
> For our small problem with only two latent classes and three
> manifest variables, the probability of each manifest variable
> combination is obtained by summing across the rows of the
> table shown above.
>
> Dale
>
> ---------------------------------------
> Dale McLerran
> Fred Hutchinson Cancer Research Center
> mailto: dmclerra@NO_SPAMfhcrc.org
> Ph: (206) 667-2926
> Fax: (206) 667-5977
> ---------------------------------------
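
The table-based calculation quoted above can be checked numerically on the small three-item, two-class example. The sketch below is Python rather than NLMIXED code, and the eta and pi values are made up purely for illustration (they are not estimates from any data). It builds the joint probabilities from the table, computes P(LC=c | x_j=1) by summing over the 2^(m-1) patterns with x_j=1, and compares that with eta_c*pi_jc / sum_k(eta_k*pi_jk). If local independence holds as in the table, the factors for the other items sum to 1 within each class, so the two numbers should agree.

```python
from itertools import product

# Hypothetical parameter values for illustration only (not fitted estimates).
eta = [0.6, 0.4]          # eta[c]: prevalence of latent class c (sums to 1)
pi = [[0.9, 0.2],         # pi[j][c]: P(x_j = 1 | latent class c)
      [0.8, 0.3],
      [0.7, 0.1]]
n_items, n_classes = len(pi), len(eta)


def joint(x, c):
    """P(X = x, LC = c): one cell of the table, assuming local independence."""
    p = eta[c]
    for j, xj in enumerate(x):
        p *= pi[j][c] if xj == 1 else 1.0 - pi[j][c]
    return p


def pattern_prob(x):
    """P(X = x): sum across one row of the table."""
    return sum(joint(x, k) for k in range(n_classes))


def posterior(x, c):
    """P(LC = c | X = x): problem (2), a table cell divided by its row sum."""
    return joint(x, c) / pattern_prob(x)


def class_given_item(j, c):
    """P(LC = c | x_j = 1): problem (1), brute force over all rows with x_j = 1."""
    rows = [x for x in product((0, 1), repeat=n_items) if x[j] == 1]
    numerator = sum(joint(x, c) for x in rows)
    denominator = sum(pattern_prob(x) for x in rows)
    return numerator / denominator


def class_given_item_short(j, c):
    """Same quantity without enumerating patterns (relies on local independence)."""
    return eta[c] * pi[j][c] / sum(eta[k] * pi[j][k] for k in range(n_classes))


brute = class_given_item(0, 0)        # P(LC=1 | x1=1) via the boxed area
short = class_given_item_short(0, 0)  # eta1*pi11 / (eta1*pi11 + eta2*pi12)
print(brute, short)
```

If the same identity holds for the fitted model, the ESTIMATE statement for problem (1) would need only C terms in its denominator rather than C*(2^(m-1)). That is worth confirming against the full sum on a small case such as this one before relying on it.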

Dale,

By the time I write out all of the ESTIMATE statements in their full glory, say 20 years or so from present day, the statistical programmers at SAS will have enhanced the GLIMMIX procedure to handle a random effects LCA model with just a click of a button!

Seriously, it was very kind of you to answer all of my questions. You've given me much to consider, and as always, I've learned a tremendous amount. If I decide to go with the NLMIXED procedure for this model, which is possible now that you've provided me with all this info, I will write back with an update.

Take care,

Ryan

