LISTSERV at the University of Georgia
Date:         Mon, 22 Mar 1999 13:27:27 +0000
Reply-To:     Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Subject:      SAS/STAT Stepwise selection
In-Reply-To:  <1E362FD41C5@violet.le.ac.uk>

B. Manktelow <bm18@LEICESTER.AC.UK> writes
>Hi,
>I am hoping that somebody can suggest a method for using automatic
>variable selection in procedures that do not have the CLASS option
>(eg LOGISTIC or PHREG).
>The problem a colleague of mine has is that she has created three
>dummy variables for a factor with four levels. However, she requires
>that they are recognised as being the same variable and are therefore
>added or removed from the model together. SAS appears to
>treat each dummy variable as a separate variable.
>Any suggestions on a way around this? We can't find anything in the
>documentation.
>(NOTE:
>1. We are fully aware of all of the dangers of automatic variable
>selection etc;
>2. I am a statistician so please be gentle with me!!!)
>
>Thanks
>Brad
>bm18@le.ac.uk

Would it be possible to reconsider those "levels"? If, like age (of anything), there is an underlying continuous variable that is converted into dummy variables just for this logistic exercise, then you may achieve what you are looking for by generating those dummy vars with a "less discrete" nature.

Ranges such as "up to 18", "18 to 30", "30 to 50", "50 to 65", etc. may offer an obvious set of dummy variables, but they have to be taken into procedures like LOGISTIC as a complete group.

But if those ranges were not defined as discrete bands, they could become independent dummy vars. For example, if the categories "under 18", "under 30", "under 50" and "under 65" are encoded in dummy vars, then, because each one divides the full range at a single point, each is a meaningful variable on its own and can come or go in the stepwise selection process without requiring its partners.
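To make the difference between the two encodings concrete, here is a minimal sketch in Python (not SAS, and not anything from the original question). The cut points come from the age example above; the function names and the choice to treat "under 18" as age < 18 are assumptions made for illustration only.

```python
# Hypothetical cut points, taken from the age example in this thread.
CUTPOINTS = (18, 30, 50, 65)

def band_dummies(age, cutpoints=CUTPOINTS):
    """Disjoint bands: exactly one dummy is 1, so the set acts as one factor.

    Returns the full set of five indicators; a model would normally drop
    one of them as the reference level.
    """
    band = sum(age >= c for c in cutpoints)  # 0 = under 18, ..., 4 = 65 and over
    return [int(band == i) for i in range(len(cutpoints) + 1)]

def threshold_dummies(age, cutpoints=CUTPOINTS):
    """Cumulative splits: under18, under30, under50, under65 (assumed age < cut)."""
    return [int(age < c) for c in cutpoints]

for age in (10, 25, 40, 60, 70):
    print(age, band_dummies(age), threshold_dummies(age))
```

For an age of 25, the band encoding gives [0, 1, 0, 0, 0], while the threshold encoding gives [0, 1, 1, 1]. Each threshold variable answers a single yes/no question about the whole range, which is why one of them can be dropped without leaving the others undefined.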

You may want to modify the exercise by replacing these categorical dummies with their opposite numbers over18, over30, over50, over65 depending on the data model and objectives for the analysis.

Not being adequately versed in the speciality, I'm not certain whether their effect is different or, conversely, whether combining both sets risks introducing collinearity.
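The collinearity worry can at least be checked mechanically. The sketch below, again in Python with made-up ages, assumes that each "over" variable is defined as the exact complement of the matching "under" variable (age >= cut versus age < cut); that definition is an assumption, not something stated above.

```python
CUTPOINTS = (18, 30, 50, 65)
ages = [10, 18, 25, 40, 60, 70]  # made-up values for illustration

def under(age, cut):
    return int(age < cut)

def over(age, cut):
    # Assumed definition: the exact complement of under().
    return int(age >= cut)

# For each cut point, under + over is 1 on every row, which is the same
# as the intercept column in a model that includes one.
sums = {cut: [under(a, cut) + over(a, cut) for a in ages] for cut in CUTPOINTS}
print(sums)
```

Under that assumption each pair always sums to 1, so a model with an intercept cannot contain both members of a pair at once. Using one set or the other should fit the same model with the signs of the coefficients reversed and the intercept shifted.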

Is this already a standard approach, or is it invalidated for any reason?

--
Peter Crawford

