LISTSERV at the University of Georgia
Date:   Wed, 8 Oct 2008 07:56:39 -0400
Reply-To:   Nathaniel.Wooding@DOM.COM
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Nat Wooding <Nathaniel.Wooding@DOM.COM>
Subject:   Re: stepwise
In-Reply-To:   <>
Content-Type:   text/plain; charset="US-ASCII"

In addition to the source that Peter suggested, let me also suggest the paper that he and David Cassell wrote last year. It appeared in several places but you can see it at

In the paper, they demonstrated that stepwise will find solutions to random collections of numbers.
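To make that point concrete, here is a minimal sketch (not the code from the paper) of forward selection run on pure noise. Everything in it is an assumption chosen for illustration: the function name forward_select, the sample size of 100, the 50 candidate predictors, and the entry cutoff of |t| > 1.98 (roughly p < 0.05 at these degrees of freedom). The outcome and every predictor are independent random normals, so any variable that enters the model is a false discovery.

```python
import math
import random


def center(v):
    """Subtract the mean so an intercept is implicitly in the model."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residualize(v, x):
    """Remove from v its least-squares projection onto x."""
    xx = dot(x, x)
    if xx == 0.0:
        return v[:]
    b = dot(v, x) / xx
    return [a - b * c for a, c in zip(v, x)]


def forward_select(n=100, p=50, t_enter=1.98, seed=0):
    """Forward selection on pure noise (hypothetical helper, illustration only).

    Returns the indices of the noise predictors that entered the model.
    """
    rng = random.Random(seed)
    y = center([rng.gauss(0, 1) for _ in range(n)])
    xs = {j: center([rng.gauss(0, 1) for _ in range(n)]) for j in range(p)}
    selected = []
    while xs:
        # Residual df once the candidate is added: n - (k + 1) - 1 for the intercept.
        df = n - len(selected) - 2
        if df <= 0:
            break
        yy = dot(y, y)
        best_j, best_t = None, 0.0
        for j, x in xs.items():
            # y and x are already residualized on the selected predictors,
            # so r is the partial correlation and t is the usual entry test.
            r = dot(x, y) / math.sqrt(dot(x, x) * yy)
            t = r * math.sqrt(df / (1.0 - r * r))
            if abs(t) > abs(best_t):
                best_j, best_t = j, t
        if best_j is None or abs(best_t) < t_enter:
            break
        x_in = xs.pop(best_j)
        selected.append(best_j)
        # Partial the entering predictor out of y and the remaining candidates.
        y = residualize(y, x_in)
        xs = {j: residualize(x, x_in) for j, x in xs.items()}
    return selected


if __name__ == "__main__":
    for s in range(5):
        print("seed", s, "noise predictors kept:", forward_select(seed=s))
```

With 50 unrelated candidates and a 5% entry test, the chance that at least one passes on the first step is about 1 - 0.95^50, or roughly 92%, so most runs should report a "model" even though there is nothing to find.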

Nat Wooding
Environmental Specialist III
Dominion, Environmental Biology
4111 Castlewood Rd
Richmond, VA 23234
Phone: 804-271-5313, Fax: 804-271-2977

From: Peter Flom <peterflomconsulting@MINDSPRING.COM>
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
To: SAS-L@LISTSERV.UGA.EDU
Date: 10/07/2008 05:04 PM
Subject: Re: stepwise
Please respond to Peter Flom <peterflomconsulting@MINDSPRING.COM>

nchapinal@YAHOO.COM wrote
>
>I am using proc stepwise to know which are the best predictors to
>distinguish between healthy and sick people.
>I know more or less how to do it. However, someone told me you can add
>an option that tells you how accurate is each particular predictor
>that you keep in the model in classifying people in the right
>category.
>
>Any help is welcome!

I see that others have already responded, saying that I don't recommend this. They are right: I don't. The best source on why this is bad is Frank Harrell's book, Regression Modeling Strategies.

First, there is, AFAIK, no such PROC as PROC STEPWISE -- it's not in SAS help, so it is hard to know just what you are doing.

Stepwise does NOT allow you to know which are the best predictors.

I can probably recommend something better, but let me ask some questions first:

1) What is your sample size?
2) Where did the sample come from? (a survey? an experiment? or what?)
3) How many independent variables (IVs) have you got?
4) What is your dependent variable? Is it dichotomous (sick vs. not)? Or a time to event (how long before you got sick)? Or something else?
5) Is the purpose of your study explanation, or prediction, or both?
6) Why did you choose the IVs you chose?
7) What does the literature say about these?




Peter L. Flom, PhD
Statistical Consultant
www DOT peterflom DOT com

