Thanks very much, all, for your suggestions. I reran the original model, and its AIC comes much closer to the new model's once both use the same observations (base model = 944, additional-variable model = 940). I'll look at the values of the dropped observations to see why it changed so much.
-Mary
----- Original Message -----
From: Zack, Matthew M. (CDC/CCHP/NCCDPHP)
To: Mary
Sent: Wednesday, August 29, 2007 12:47 PM
Subject: RE: AIC versus c statistic in Proc Logistic
See comments below.
Matthew Zack
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Mary
Sent: Wednesday, August 29, 2007 1:24 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: AIC versus c statistic in Proc Logistic
Kevin,
Thanks for your reply.
Yes, 3 additional observations were dropped in the analysis: the first
model used 876 of 890 observations, and the second, with the additional
variable, used 873 of 890.
>>Redo the first analysis (the one with fewer variables) after deleting the
>>three observations with missing values of the additional variable (for
>>example, use a WHERE statement).
>>Then the AIC statistic and the c-statistic will be based on the same
>>873 observations.
>>This probably won't change their values much from those based on
>>876 observations.
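
The complete-case restriction described above can be sketched outside SAS as well. The following Python snippet is only an illustration, not the WHERE statement itself: the function name `complete_cases`, the variable names, and the toy rows are all hypothetical. It keeps only the rows that are complete on every variable of the larger model, so that both models would be fit to the same observations.

```python
def complete_cases(rows, variables):
    """Return the rows with no missing (None) value in any listed variable."""
    return [row for row in rows
            if all(row.get(v) is not None for v in variables)]


# Hypothetical toy data: x2 plays the role of the additional variable.
rows = [
    {"y": 1, "x1": 0.5, "x2": 1.2},
    {"y": 0, "x1": 1.1, "x2": None},   # dropped: x2 is missing
    {"y": 0, "x1": 0.3, "x2": 0.7},
    {"y": 1, "x1": None, "x2": 0.4},   # dropped: x1 is missing
]

# Filter on the variables of the LARGER model, then fit both models to
# analysis_rows so their fit statistics describe the same observations.
full_model_vars = ["y", "x1", "x2"]
analysis_rows = complete_cases(rows, full_model_vars)
print(len(rows), "rows read;", len(analysis_rows), "rows used")
```

The design point is that the filter uses the larger model's variable list even when fitting the smaller model.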
Still, wouldn't one compare models using the AIC statistic?
>>Yes.
If the Type 3 statistic for the variable added is also significant,
shouldn't I still conclude that the second model is the better model?
>>Yes.
>>
>>However, because the AIC statistic and the c-statistic are calculated
>>differently, changes in their values due to adding an additional
>>variable apparently are not linearly proportional, as you have shown.
>>The only "easy" way I know of to calculate the statistical
>>significance of a c-statistic is indirectly, through its usually
>>asymmetrical 95% confidence interval generated from multiple bootstrap
>>replications. As I recall, though only vaguely, other methods for
>>calculating its statistical significance rely on either assuming
>>bivariate normal distributions for the sensitivity and (1-specificity)
>>or using a Wilcoxon statistic.
-Mary
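
As a rough illustration of the bootstrap approach mentioned in the comments above, here is a minimal Python sketch. It makes several assumptions that are not from the thread: the function names, the toy outcomes and predicted probabilities, the number of replicates, and the seed are all hypothetical. The c-statistic is computed as the proportion of concordant (event, non-event) pairs, with ties counted as one half, which is the Wilcoxon/Mann-Whitney form. The interval is a simple percentile bootstrap that resamples (outcome, predicted probability) pairs without refitting the model, so it is a simplification rather than a full procedure.

```python
import random


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied pairs count one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(nonevents))


def bootstrap_ci(y, p, n_boot=2000, alpha=0.05, seed=2007):
    """Percentile bootstrap interval for the c-statistic (no model refit)."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            stats.append(c_statistic(ys, [p[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical outcomes and predicted probabilities, for illustration only.
y = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
p = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.8, 0.5, 0.55, 0.1]
print("c =", c_statistic(y, p))
print("95% percentile interval:", bootstrap_ci(y, p))
```

Because the c-statistic is bounded by 1, the percentile interval is often asymmetric around the point estimate, which matches the remark about asymmetry above.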
----- Original Message -----
From: Kevin Roland Viel
To: SAS-L@LISTSERV.UGA.EDU
Sent: Wednesday, August 29, 2007 12:11 PM
Subject: Re: AIC versus c statistic in Proc Logistic
On Wed, 29 Aug 2007, Mary wrote:
> Hi,
>
> I'm working on Proc Logistic and have a question about the AIC
> statistic versus the c statistic.
>
> I did take the SAS Categorical Data Analysis class, and was taught
> there to look for the smallest AIC statistic and the largest c
> statistic.
>
> I'm now comparing two models:
>
> 1. AIC= 983.21, c statistic= .74
> 2. AIC= 940.17, c statistic= .757
>
>
> Model 2 has the same variables as model 1, with one additional
> variable, whose Type 3 effect is significant at .01.
>
> My question is, why don't I see a bigger jump in the c statistic
> (about a 2% rise) when the AIC statistic is dropping substantially
> (about a 5% drop)?
Have you verified that the same observations contribute to both
models?
If that one additional variable has missing data, the use of AIC is
not straightforward.
Kevin
Kevin Viel, PhD
Post-doctoral fellow
Department of Genetics
Southwest Foundation for Biomedical Research
San Antonio, TX 78227
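
To illustrate why AIC comparisons need the same observations in both models, here is a small Python sketch. It assumes the usual definition AIC = -2 log L + 2k and uses an intercept-only logistic model, whose maximum-likelihood estimate is simply the observed event proportion. The event counts are hypothetical and are not taken from the data discussed in this thread; only the sample sizes 876 and 873 come from the messages above.

```python
import math


def intercept_only_aic(n_events, n_total):
    """AIC of an intercept-only logistic model.

    The MLE of the event probability is n_events / n_total, and the model
    has one estimated parameter (the intercept).
    """
    p = n_events / n_total
    log_lik = (n_events * math.log(p)
               + (n_total - n_events) * math.log(1 - p))
    return -2.0 * log_lik + 2 * 1


# Hypothetical event counts; only the sample sizes come from the thread.
aic_876 = intercept_only_aic(300, 876)
aic_873 = intercept_only_aic(299, 873)

# The model is the same, yet the AIC is lower on the smaller sample,
# because -2 log L is a sum over the observations actually used.
print(round(aic_876, 2), round(aic_873, 2))
```

The log-likelihood is a sum over observations, so AIC shifts when rows are dropped even if the model is unchanged. The c-statistic is a proportion of concordant pairs and does not scale with sample size in the same way.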