LISTSERV at the University of Georgia
Date:         Wed, 3 Jul 2002 09:24:12 -0700
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: Proc to Cross-Validate Proc Logistic (Proc or macro)
Comments: To: mark.k.moran@CENSUS.GOV
In-Reply-To:  <OF629DA2F6.7263CABE-ON85256BEB.004C67B6@tco.census.gov>
Content-Type: text/plain; charset=us-ascii

Mark,

The leave-out-one approach is a classic crossvalidation strategy dating from the 1960s. Mosteller and Wallace (1963, JASA 58, 275-309) first suggested it. Mosteller and Tukey (1968, Handbook of Social Psychology, G. Lindzey and E. Aronson, eds., Addison-Wesley) and Lachenbruch and Mickey (1968, Technometrics 10, 1-11) were the first real applications.

In the leave-out-one approach, each observation from 1 to N is dropped in turn from the data used to fit the model and serves as the validation sample. So you have N models and N corresponding validation samples, each consisting of a single observation. In linear models, the regression coefficients can be quickly updated from the full-data fit when you employ a leave-out-one strategy, so no refitting is needed.

For nonlinear models such as logistic regression, a fast algorithm can only approximate the parameter estimates. To obtain the maximum likelihood estimates of the parameters when the i-th observation is dropped, one would really need to iterate on the original data rather than employ the "hat" matrix. But iterating on the original data for every dropped observation would be extremely time consuming, without much benefit in terms of precision of the parameter estimates in most cases. Therefore, the standard implementation of crossvalidation is now the leave-out-one approach with approximation of the parameter estimates.

Any other crossvalidation approach would need explicit operationalization (for example, how many pieces to split the data into and how observations are assigned to them). This as much as anything is why the leave-out-one approach is a standard implementation: it is already operationally well defined, regardless of the particular sample that you are working with.
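As an illustration of the linear-model shortcut mentioned above, here is a minimal sketch in Python (not SAS, and not taken from any SAS procedure). It assumes a simple one-predictor least squares fit and uses the standard identity that the leave-out-one prediction error equals the ordinary residual divided by (1 - h_ii), where h_ii is the i-th diagonal element of the "hat" matrix. The function names and the small data set are made up for the example. The brute-force refit is included only to check the shortcut.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b


def loo_errors_shortcut(xs, ys):
    """Leave-out-one prediction errors from ONE fit: e_i / (1 - h_ii)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    errors = []
    for x, y in zip(xs, ys):
        h = 1.0 / n + (x - xbar) ** 2 / sxx  # leverage (hat matrix diagonal)
        errors.append((y - (a + b * x)) / (1.0 - h))
    return errors


def loo_errors_refit(xs, ys):
    """Same quantity by brute force: refit N times, dropping one point each time."""
    errors = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(ys[i] - (a + b * xs[i]))
    return errors


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]

shortcut = loo_errors_shortcut(xs, ys)
refit = loo_errors_refit(xs, ys)
print(max(abs(p - q) for p, q in zip(shortcut, refit)))  # agreement up to rounding
```

For a linear model the two computations agree up to floating-point rounding, which is why no refitting is needed. For logistic regression no such exact identity holds, which is the reason for the one-step approximation.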

Dale

--- Mark Moran <mark.k.moran@CENSUS.GOV> wrote:
> Dale, my friend's dataset contains hundreds of thousands of records. Are
> you saying that from all these records this will drop "one" observation
> from it and fit the regression? Isn't there a much more general way to
> create a crossvalidation for any regression routine, macro's already
> written for such a purpose?
>
> Mark
>
> ------------------------------------------------------------
>
> Mark,
>
> The classification table generated with the CTABLE option to the
> MODEL statement of PROC LOGISTIC is generated through a leave-out-one
> approximate crossvalidation. I say approximate crossvalidation
> since the parameter estimates when the i-th observation is dropped
> are not the full maximum likelihood estimates, but a one-step
> approximation to the parameter estimates. I am sure that a macro
> could be easily written to handle the crossvalidation when more
> than one observation is dropped at any one time.
>
> Dale
>
>
> --- Mark Moran <mark.k.moran@CENSUS.GOV> wrote:
> > My colleague is running a PROC LOGISTIC in SAS 8.2 with all
> > categorical variables (dummy variables and interactions, he says). He
> > wants to be able to cross-validate his model, splitting the data into
> > 10 pieces, reserving 1, predicting, moving on to another split, etc. Is
> > there an existing way to accomplish this in SAS (new SAS 8 Proc would
> > be the first choice, or second a macro)?
> >
> > Mark Moran
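The 10-piece scheme described in the quoted question needs exactly the kind of explicit operationalization discussed above: a rule for assigning observations to pieces. A minimal sketch of one such rule follows, in Python rather than SAS. It assumes a random assignment with a fixed seed; the function names, the seed, and the choice of random assignment are assumptions for illustration, not an existing SAS macro. The model fitting step itself (for instance, one PROC LOGISTIC run per split) is not shown.

```python
import random


def kfold_indices(n, k=10, seed=0):
    """Shuffle observation indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)        # fixed seed makes the split reproducible
    return [idx[f::k] for f in range(k)]    # fold sizes differ by at most one


def train_validation_splits(n, k=10, seed=0):
    """Yield (training indices, validation indices) for each of the k folds."""
    folds = kfold_indices(n, k, seed)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        yield train, sorted(fold)


# Hypothetical usage: 25 observations split into 10 pieces.
for train, validation in train_validation_splits(25, k=10, seed=1):
    # Fit the model on `train`, then predict for `validation` (not shown).
    print(len(train), len(validation))
```

Each observation lands in exactly one validation piece, so every record is predicted once by a model that did not use it.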

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
