Date: Wed, 3 Jul 2002 13:57:38 -0400
Reply-To: mark.k.moran@CENSUS.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mark Moran <mark.k.moran@CENSUS.GOV>
Subject: Re: Proc to Cross-Validate Proc Logistic (Proc or macro)
Content-Type: text/plain; charset=us-ascii
Thank you, Dale, you've been a great help.
Mark
Dale McLerran <stringplayer_2@yahoo.com>
07/03/2002 01:22 PM
To: mark.k.moran@CENSUS.GOV, SAS-L@LISTSERV.UGA.EDU
cc:
Subject: Re: Proc to Cross-Validate Proc Logistic (Proc or macro)
Mark,
As I indicated in my previous reply, the leave-out-one approach
does not fit the full maximum likelihood parameter estimates for
each sample. Rather, it uses an approximation to the parameter
estimates which can be computed with extreme efficiency. It takes
very little time to form the leave-out-one approximate parameter
estimates. I don't have time to go into the details of how it
is done, but if your colleague has a PhD in statistics, he would
understand quite quickly the efficiency of the approach if he were
to look at the documentation pertaining to the CTABLE option.
Dale
 Mark Moran <mark.k.moran@CENSUS.GOV> wrote:
> Obviously I am relatively untutored on cross-validation. My colleague
> has a PhD and his approach is to divide the data into 10 subsamples,
> fit a model using I suppose 9 of the 10 subsamples to predict the
> 10th, change to a different 9 subsamples to predict the 10th, etc.
> This would require only 10 models. If there are 500,000 records, then
> by the leave-one-out method of cross-validation how long will it take
> to create 500,000 models in PROC LOGISTIC by this method? Wouldn't
> this quickly become cumbersome with so many observations? I think he
> is predicting from 5 or 6 predictors.
>
> Mark Moran
>
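
The 10-subsample scheme described in the quoted message can be sketched as follows. This is only an illustration, written in Python rather than SAS; the function name k_fold_splits and the record count of 50 are hypothetical and not taken from the thread. It shows how the records are partitioned and why only 10 model fits are needed, whereas an exact leave-one-out refit would need one fit per record.

```python
def k_fold_splits(n_records, k=10):
    """Yield (train_indices, test_indices) for each of k folds.

    Hypothetical helper: records are assigned to folds round-robin.
    Real data would normally be shuffled first so that folds are random.
    """
    indices = list(range(n_records))
    for fold in range(k):
        test = indices[fold::k]              # every k-th record, offset by fold
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Small illustrative data set (50 records, 10 folds).
n_records = 50
splits = list(k_fold_splits(n_records, k=10))

# One model is fitted per fold on `train` and scored on `test`.
n_fits_k_fold = len(splits)

# An exact leave-one-out refit would need one model per record.
n_fits_exact_leave_one_out = n_records
```

With 500,000 records the same partitioning still needs only 10 fits, while an exact leave-one-out refit would need 500,000, which is why an approximation is used for the leave-one-out case.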
=====

Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977

