Date: Fri, 4 Jun 2004 15:29:04 -0700
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject: Re: stepwise linear regression modeling
Content-type: text/plain; charset=US-ASCII
Paul Thompson <paul@WUBIOS.WUSTL.EDU> sagely replied to my post:
> [me]> Your best option is: do not do stepwise regression.
> 100 % agree
See? I told you he was sagacious. :-)
> [me]> Seriously. I have written pages on this issue in SAS-L before.
> [me]> (You can bore yourself to tears by looking up my rants in the
> [me]> SAS-L archives at
> [me]> if you want to. Just search for the keyword 'stepwise'.)
> [me]> Particularly when you are working with interaction terms, which
> [me]> by definition will be highly correlated with other variables
> [me]> in your model, stepwise regression can do bad things. You have
> [me]> no guarantee that you will get the right term, instead of some
> [me]> higher-order term which happens to be correlated.
> There is nothing wrong with selecting terms for a final model. You
> should do it yourself, however. Fit a model. Decide which terms
> stay in. Sometimes, I retain non-sig terms - which should remain in for
> SUBSTANTIVE reasons (maybe age-adjustment is sensible).
Agreed. Absolutely. Scientific systemata are more important than
arbitrary statistical cutoffs. Build a model based on sound scientific
knowledge and hypothesis, then *test* that model. Don't throw every
variable (and all their interactions) into a big hopper labeled "DANGER:
STEPWISE" and assume that the results are 'right'.
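For anyone who wants to see the interaction-term problem rather than take my word for it, here is a minimal sketch. It is in Python rather than SAS, purely so it runs anywhere, and everything in it is an assumption for illustration: the data are simulated, and the names x1, x2, x1x2 and pearson() are made up. Two predictors are generated independently, yet their product ends up strongly correlated with each of them simply because neither is centered at zero.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2004)  # arbitrary seed, only so the run is repeatable

# Hypothetical data: two independent predictors, both far from zero.
x1 = [rng.gauss(10, 1) for _ in range(500)]
x2 = [rng.gauss(10, 1) for _ in range(500)]

# The interaction term a stepwise procedure would be offered as a candidate.
x1x2 = [a * b for a, b in zip(x1, x2)]

r_main = pearson(x1, x2)      # near 0: the predictors are independent
r_inter = pearson(x1, x1x2)   # large: the product tracks x1 closely

print(f"corr(x1, x2)    = {r_main:.2f}")
print(f"corr(x1, x1*x2) = {r_inter:.2f}")
```

Under these assumptions the second correlation should come out around 0.7, since the theoretical value is 10/sqrt(201). With that much overlap, an automated selection rule has little basis for choosing between x1 and x1*x2, which is how the higher-order term can end up in the model in place of the right one.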
If you can't avoid working with vast numbers of inter-correlated
variables, and you only want results like some sort of working predictive
formula, then stepwise regression is totally wrong for you. Look at
PROC PLS and similar methodologies instead.
Hey, I agree with Paul. Does that make me sagacious too?
David, the Circular Reasoner
David Cassell, CSC
Senior computing specialist