LISTSERV at the University of Georgia
Date:         Wed, 10 Apr 2002 09:13:00 -0700
Reply-To:     Cassell.David@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject:      Re: on the Class or not?
Content-type: text/plain; charset=us-ascii

paula D <sophe@USA.NET> wrote:
> I have a continuous variable, say, var1 that is just like a $ amount variable.
> Now I take var1=25 as a cut-off value to generate another variable Var2.
> if var1 GE 25 then var2=1; else var2=0;
> Now Var2 only has 2 values.
>
> I use proc logistic to run a model. The question is: should var2 be
> put on the variable list of the Class statement? The for argument is
> that var2 only has 2 values and surely it categorizes var1. The
> against argument is that var2=1 is still ordinal because var2=1 is
> var2=0.

Okay, first VAR2 should be considered a categorical variable [under most conditions]. You could just as easily have coded it as 'above' and 'below', or 'yes/no', or classes 'A' and 'B'.

Second, if you want it in your model you should put it in your CLASS statement, unless you have specific reasons for treating it as a continuous variable. I once was presented with a 'class' variable that took values of 1, 2, 3, 4, 5 - but then I found out that those values were artifacts of the way the study was set up, and the scientist really wanted this as a continuous variable with an estimated slope. So you have to consider the problem carefully. But yours looks like a clear categorical variable.
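
As a rough sketch of the two choices (not the original poster's actual code): the data set name MYDATA and the binary response Y below are made up for illustration, while VAR1 and VAR2 follow the question. The first PROC LOGISTIC step treats VAR2 as a category via CLASS; the second leaves it off the CLASS statement, so VAR2 enters as a numeric 0/1 variable with a single slope.

```sas
/* Hypothetical names: MYDATA (input data set) and Y (0/1 response)   */
/* are assumptions for illustration; VAR1 and VAR2 come from the post. */
data work.binned;
   set work.mydata;
   /* Same cut-off as in the question. Note that a missing VAR1  */
   /* compares as less than 25, so it would land in var2 = 0.    */
   if var1 ge 25 then var2 = 1;
   else var2 = 0;
run;

/* VAR2 treated as a categorical predictor.                          */
/* DESCENDING models the probability of the higher response level.   */
proc logistic data=work.binned descending;
   class var2;
   model y = var2;
run;

/* VAR2 treated as a numeric 0/1 predictor (no CLASS statement) */
proc logistic data=work.binned descending;
   model y = var2;
run;
```

With only two levels, both steps should describe the same fitted model. What differs is how the design column is coded (PROC LOGISTIC uses effect coding for CLASS variables unless told otherwise), so the reported estimate for VAR2 is on a different scale.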

> I tried both in my exercise and they are not that different as far as
> variable selection is concerned, unless var2 is highly correlated with
> other predictors.

Well, VAR2 *has* to be strongly related to VAR1 - among others - since it is computed directly from VAR1. I'll guess that if you check using PROC CORR you'll find a positive Spearman correlation between your VAR1 and VAR2 (the ties in VAR2 will keep it below 1.00). ;-)
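
A quick way to run that check, sketched with a made-up data set name (WORK.BINNED, assumed to hold both VAR1 and VAR2):

```sas
/* WORK.BINNED is a hypothetical data set containing VAR1 and VAR2. */
/* PEARSON and SPEARMAN request both correlation types in one run.  */
proc corr data=work.binned pearson spearman;
   var var1 var2;
run;
```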

I suggest that you not use VAR1 and VAR2 in the same model. They are telling you the same information, and VAR2 clearly has less information than VAR1 because of the 'binning'.

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

