LISTSERV at the University of Georgia
Date:         Thu, 13 Apr 2006 21:12:27 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: Logistic regression w/survey data
In-Reply-To:  <200604131442.k3DB1EY5010575@malibu.cc.uga.edu>
Content-Type: text/plain; format=flowed

axnjxntx@YAHOO.COM wrote:
>I have a nationally representative survey data set,
>with which I am trying to do a logistic regression.
>
>I know to use the survey procs - I normally use SUDAAN
>(RLOGIST in SAS-callable SUDAAN), but I am trying to
>figure out and become more comfortable/familiar with
>SURVEYLOGISTIC.

Good so far...

>My question/issue is...the tests & fit statistics that
>are produced with SURVEYLOGISTIC seem to be off.  The
>numbers/values seem to be too big.  I'm wondering if
>this has to do with the weights.  (I'll show examples
>below).

I can't tell from way over here, but I doubt it. I typically see *really* close matches between RLOGIST and SURVEYLOGISTIC.

But there may be something wrong with your code.

>I have tried scaling the weights - dividing each
>observation's weight by the mean of all the weights.
>The numbers I get for the tests & fit statistics are
>still big, but maybe more reasonable (again, examples
>below).

That's a BAD thing. Never 'scale' the weights. If you divide each weight by the mean of all the weights, then you have something nonsensical for your values, such as the relative weight times n. That's double plus ungood. Weights are real quantities with real, physical meaning. 'Scaling' them is usually a step toward ruin.
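
To make the "relative weight times n" remark concrete, here is a small Python sketch. The four weights are invented for illustration and do not come from the poster's survey; the variable names are mine. It only shows the arithmetic: dividing each weight by the mean weight gives the same numbers as multiplying each relative weight by n, so the 'scaled' weights add up to the number of respondents rather than to the population total that the original sampling weights estimate.

```python
# Hypothetical sampling weights (invented for illustration): each value is
# the number of people in the population that one respondent represents.
weights = [500.0, 1000.0, 1500.0, 650.0]

n = len(weights)              # number of respondents
total = sum(weights)          # estimated population size
mean_weight = total / n

# "Scaling" as described in the question: divide each weight by the mean.
scaled = [w / mean_weight for w in weights]

# The same quantity written as (relative weight) * n.
relative_times_n = [(w / total) * n for w in weights]

print("sum of original weights:", total)
print("sum of scaled weights:  ", sum(scaled))
print("scaled weights:         ", scaled)
print("relative weight * n:    ", relative_times_n)
```

After scaling, the weights no longer say how many people each respondent stands for, which is the physical meaning referred to above.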

>I'm wondering if it's OK to scale the weights like
>this when using survey data.  All other info - beta
>estimates, ORs, CIs, p-values - seemed to be the same.
> I just noticed the difference in the fit/test
>statistics.

Please don't scale the weights. I don't care if you scale analytic weights when working with weights from experimental designs. Those weights are scalable. But sampling weights are *not* scalable.

>Also - should I even use/trust the results from these
>fit/test statistics?
>I have tried looking at this before with a different,
>nationally representative survey.  I got similar
>results.
>I may also have to run a cumulative logit &/or ordinal
>model.  I didn't know if I'd be able to use the
>results from the proportional odds test.  I have
>played with this in the past, but again, the
>chi-square values I got using regular weights seemed
>to be very large!

As I said before, I have no idea where the problem lies. But, based on my experiences, I would guess the problem is the way you have defined the design effects. If you are using the same model and the same weights as in RLOGIST but coming up with drastically different results, then there may be some misunderstanding about the rest of the statements in the proc.

>Here are the examples of the fit / test stats I got.
>This is for the very first run of the model, and I
>know it definitely is not the final model!!!
>
>REGULAR WEIGHTS:
>            Model Fit Statistics
>
>                              Intercept
>               Intercept         and
>Criterion        Only         Covariates
>
>AIC            3403773.0       1698265.4
>SC             3403778.8       1698363.2
>-2 Log L       3403771.0       1698231.4
>
>R-Square 1.0000    Max-rescaled R-Square 1.0000
>
>    Testing Global Null Hypothesis: BETA=0
>
>Test               Chi-Square    DF    Pr > ChiSq
>
>Likelihood Ratio   1705539.60    16    <.0001
>Score              1424442.49    16    <.0001
>Wald                 690.5363    16    <.0001
>
>SCALED WEIGHTS:
>            Model Fit Statistics
>
>                              Intercept
>               Intercept         and
>Criterion        Only         Covariates
>
>AIC             3725.964        1891.984
>SC              3731.718        1989.810
>-2 Log L        3723.964        1857.984
>
>R-Square 0.5507    Max-rescaled R-Square 0.6906
>
>    Testing Global Null Hypothesis: BETA=0
>
>Test               Chi-Square    DF    Pr > ChiSq
>
>Likelihood Ratio    1865.9797    16    <.0001
>Score               1558.4398    16    <.0001
>Wald                 690.5363    16    <.0001

Those R-squares (really, generalized coefficients of determination) of 1 are just freaky. Is there something odd about your data?
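
One quick arithmetic check on the quoted output is sketched below in Python. The numbers are copied from the two tables in the question; the dictionary names are mine. It divides each regular-weight statistic by its scaled-weight counterpart. A constant ratio would be consistent with (though not proof of) the likelihood-based statistics simply growing in proportion to the weights, with the ratio being roughly the mean weight, while the Wald statistic is unaffected.

```python
# Statistics copied from the quoted output (regular vs. scaled weights).
regular = {
    "-2 Log L (intercept only)": 3403771.0,
    "-2 Log L (with covariates)": 1698231.4,
    "Likelihood Ratio": 1705539.60,
    "Score": 1424442.49,
    "Wald": 690.5363,
}
scaled = {
    "-2 Log L (intercept only)": 3723.964,
    "-2 Log L (with covariates)": 1857.984,
    "Likelihood Ratio": 1865.9797,
    "Score": 1558.4398,
    "Wald": 690.5363,
}

# Ratio of each regular-weight statistic to its scaled-weight counterpart.
ratios = {name: regular[name] / scaled[name] for name in regular}

for name, ratio in ratios.items():
    print(f"{name:28s} {ratio:10.3f}")
```

If the ratios for -2 Log L, the likelihood ratio test, and the score test all agree, the difference between the two runs is a pure rescaling and not a change in the fitted model, which matches the poster's remark that the estimates, ORs, CIs, and p-values were the same.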

I think you need to write back to SAS-L (not to me personally) and tell us a bit more.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
