LISTSERV at the University of Georgia
Date:         Thu, 15 Jul 2010 07:13:42 -0400
Reply-To:     peterflomconsulting@mindspring.com
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject:      Re: Step-Wise Methods re-evaluated
Comments: To: David J Moriarty <djmoriarty@CSUPOMONA.EDU>
In-Reply-To:  <20100714234740.A090910008F@adler.unx.csupomona.edu>
Content-Type: text/plain; charset="us-ascii"

David J Moriarty wrote <<< My opinion is that of a biologist, not a statistician, but I think there is a role for stepwise methods. If all you are interested in is the best possible prediction, then step-wise doesn't seem advantageous. Use all available predictors; generally, the more predictors the better.

But if you have a large number of potential predictors, and you're trying to identify a relevant subset of those predictors, then I think step-wise methods can be part of the effort. We need to realize the substantial problems with the methods, as pointed out by our statistical colleagues, and judge the results accordingly. But step-wise methods may illuminate patterns in a large data set that might be important. Why were certain predictors included? Why were others excluded? Is it something relevant, or just a fault in the method? Step-wise might get us to ask some questions that could be important.

I recommend step-wise methods only in a heuristic sense - they help elucidate patterns and ask questions. But they should only be used as a small part of a large, comprehensive analysis of the data. If a student brings me a thesis where all that's been done with the data is some step-wise method, and all biological conclusions come from that method - well, that's unacceptable. I would tell them they have just barely got started in terms of understanding their data. >>>>

This makes some sense. Anything that gets you to "ask questions that could be important" is a good thing.

I'd suggest, however, that there are now better ways to get these questions raised: use PROC GLMSELECT, vary the parameters, and see what happens. Another alternative (although, I believe, it isn't in SAS/STAT, only in an add-on) is to use tree methods as exploratory tools. These can really open up the field of questions and illuminate patterns that are difficult or impossible to find with traditional regression, regardless of variable selection method.
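
As a rough sketch of what "vary the parameters" might look like, the three runs below fit the same candidate model under different selection settings so the chosen subsets can be compared. The data set (SASHELP.BASEBALL), the response, the predictor list, and the specific settings (the 0.15 entry/stay levels, 5-fold cross validation, the seed) are all assumptions picked for illustration, not anything from this thread; substitute your own data and candidates.

```sas
/* Assumed example data: SASHELP.BASEBALL, response logSalary.        */
/* The macro variable just keeps the candidate list identical         */
/* across runs; the names are placeholders for your own predictors.   */
%let candidates = nAtBat nHits nHome nRuns nRBI nBB yrMajor
                  crAtBat crHits crHome crRuns crRbi crBB
                  league division nOuts nAssts nError;

/* Run 1: traditional stepwise, entry/stay by significance level */
proc glmselect data=sashelp.baseball;
  class league division;
  model logSalary = &candidates
        / selection=stepwise(select=sl sle=0.15 sls=0.15);
run;

/* Run 2: forward selection, effects entered by the SBC criterion */
proc glmselect data=sashelp.baseball;
  class league division;
  model logSalary = &candidates
        / selection=forward(select=sbc);
run;

/* Run 3: LASSO, with the final model chosen by 5-fold cross validation */
proc glmselect data=sashelp.baseball seed=20100715;
  class league division;
  model logSalary = &candidates
        / selection=lasso(choose=cv stop=none) cvmethod=random(5);
run;
```

Predictors that are kept under every setting, and those that drop in and out as the settings change, are exactly the sort of thing that can prompt the "why was this included, why was that excluded" questions.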

Peter

