LISTSERV at the University of Georgia
Date:         Thu, 30 Aug 2007 15:04:07 -0500
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: AIC versus c statistic in Proc Logistic
Comments: To: Bads <badrish.prakash@GMAIL.COM>
Content-Type: text/plain; charset="iso-8859-1"

Badrish,

Unfortunately, the variables are SNPs from genes, unlike other variables where we have an intuitive sense of which are appropriate for the model. I've got 3200 of them :-).

I'm a SAS programmer, but I DID take 30 credit hours of GRADUATE-level statistics in college (I didn't quite finish a master's degree in statistics; I was about 2 courses away when I took a job), worked as the SAS consultant at universities for 10 years, and just finished 6 SAS courses in the statistics curriculum ($10,000 retail price). I have a master's degree in MIS.

So where does that put me, as a SAS programmer, or as not quite a statistician but closer to one than most SAS programmers?

I'll look for the articles as I have time.

Thanks.

-Mary

----- Original Message -----
From: Bads
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, August 30, 2007 2:48 PM
Subject: Re: AIC versus c statistic in Proc Logistic

Essentially, all models are wrong, but some are useful - George Box

On your question of model selection, try not to lose focus on the objective variable y and the relevance of all the x's. The AIC, the concordance / c statistic, and the Wald chi-square just _hint_ towards an optimal solution. Ultimately, it's the modeler's choice that prevails.

This Friday, please hunt for the posts of Statistician vs SAS Programmers. (It wouldn't help in your model building, but is guaranteed to keep you smiling throughout the weekend)

~Badrish

On Aug 30, 10:55 am, davidlcass...@MSN.COM (David L Cassell) wrote:
> mlhow...@avalon.net wrote:
>
> >Hi,
> >
> >I'm working on Proc Logistic and have a question about the AIC statistic
> >versus the c statistic.
> >
> >I did take the SAS Categorical Data Analysis class, and was taught there to
> >look for the smallest AIC statistic, and the largest c statistic.
>
> But you can't count on them to agree. Even if you focus on, say, two
> similar measures which address the same thing, say AIC and SBC (both of
> which look at information theoretic concepts), you cannot count on both
> of them giving you the same model.
>
> >I'm now comparing two models:
> >
> >1. AIC= 983.21, c statistic= .74
> >2. AIC= 940.17, c statistic= .757
>
> Here, the models are so close that if you plotted these with your other
> model results that you might not be able to tell which was better from
> your graph.
>
> >Model 2 has the same variables as model 1, with one additional variable,
> >whose type 3 effect is significant at .01.
> >
> >My question is, why don't I see a bigger jump in the c statistic (about a
> >2% rise) when the AIC statistic is dropping substantially (about a 5%
> >drop)?
> >
> >Mary
>
> I don't see a big change in either. I would personally say that either
> model would be an acceptable choice, given these results, so I would pick
> the one that makes more sense from a subject matter POV.
>
> But first I would check things like missing values (to see if one
> statistic's effect is driven by changes in the number of records used),
> and diagnostic plots (to see if there are problems that I need to address
> before I consider showing either model to anyone). I'm
> Obsessive-Compulsive like that. :-)
>
> HTH,
> David
> --
> David L. Cassell
> mathematical statistician
> Design Pathways
> 3115 NW Norwood Pl.
> Corvallis OR 97330
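
The arithmetic in the question above can be checked with a short sketch. It is written in Python rather than SAS purely for illustration, and the helper names (`pct_change`, `log_likelihood`, `aic`, `c_statistic`) and the four-observation toy data are assumptions made here, not anything from the thread or from Proc Logistic output. It also illustrates one plausible reason the two statistics need not move together: the c statistic depends only on how the predicted probabilities rank events against non-events, while AIC is built from the log-likelihood and so responds to the probability values themselves.

```python
import math

def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y given predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def aic(loglik, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * n_params

def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied pairs count one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))

# Figures quoted in the thread.
aic_change = pct_change(983.21, 940.17)   # roughly a 4.4% drop
c_change = pct_change(0.74, 0.757)        # roughly a 2.3% rise

# Hypothetical toy data: two sets of predictions with the same ordering.
y = [1, 1, 0, 0]
p_a = [0.90, 0.60, 0.70, 0.20]
p_b = [0.99, 0.55, 0.60, 0.01]   # same ranks as p_a, different values

c_a, c_b = c_statistic(y, p_a), c_statistic(y, p_b)
# Same (assumed) parameter count for both, so only the likelihood differs.
aic_a = aic(log_likelihood(y, p_a), n_params=2)
aic_b = aic(log_likelihood(y, p_b), n_params=2)

print(f"AIC change: {aic_change:.2f}%, c change: {c_change:.2f}%")
print(f"c_a={c_a}, c_b={c_b}, aic_a={aic_a:.3f}, aic_b={aic_b:.3f}")
```

In the toy data the c statistic is identical for both sets of predictions while the AIC values differ, because the ranking is unchanged but the likelihood is not. Note also that AIC is usually compared through absolute differences (here 983.21 - 940.17 = 43.04) rather than percentage changes, so the "5% versus 2%" comparison is not like for like.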

