Date: Wed, 10 Apr 2002 09:13:00 -0700
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject: Re: on the Class or not?
Content-type: text/plain; charset=us-ascii
paula D <sophe@USA.NET> wrote:
> I have a continuous variable, say, var1 that is just like a $ amount
> Now I take var1=25 as a cut-off value to generate another variable
> if var1 GE 25 then var2=1; else var2=0;
> Now Var2 only has 2 values.
> I use proc logistic to run a model. The question is: should var2 be
> put on the variable list of the Class statement? The for argument is
> that var2 only has 2 values and surely it categorizes var1. The
> against argument is that var2=1 is still ordinal because var2=1 is >
Okay, first VAR2 should be considered a categorical variable [under
most conditions]. You could just as easily have coded it as 'above'
and 'below', or 'yes/no', or classes 'A' and 'B'.
Second, if you want it in your model you should put it in your CLASS
statement, unless you have specific reasons for treating it as a
continuous variable. I once was presented with a 'class' variable
that took values of 1, 2, 3, 4, 5 - but then I found out that those
values were artifacts of the way the study was set up, and the scientist
really wanted this as a continuous variable with an estimated slope.
So you have to consider the problem carefully. But yours looks like
a clear category variable.
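A minimal sketch of what putting VAR2 on the CLASS statement could look like. The data set WORK.DEMO, the response Y, and the simulated values are all made up for illustration; only VAR1, VAR2, and the cut-off of 25 come from the question. The EVENT= and REF= options assume a reasonably recent SAS release.

```sas
/* Simulated data, purely illustrative (not the poster's data). */
data work.demo;
    call streaminit(2002);
    do i = 1 to 200;
        var1 = 50 * rand('uniform');              /* continuous, $-like amount  */
        var2 = (var1 >= 25);                      /* 1 if var1 GE 25, else 0    */
        y    = rand('bernoulli', 0.3 + 0.4*var2); /* hypothetical 0/1 response  */
        output;
    end;
    drop i;
run;

/* VAR2 treated as a categorical predictor via the CLASS statement. */
proc logistic data=work.demo;
    class var2 (ref='0') / param=ref;   /* reference coding, level 0 as baseline */
    model y(event='1') = var2;
run;
```

With PARAM=REF and a 0/1 variable, the VAR2 estimate is the same as you would get by leaving VAR2 off the CLASS statement. The default effect coding rescales the estimate but does not change the fit.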
> I tried both in my exercise and they are not that different as far as
> variable selection is concerned, unless var2 is highly correlated with
> other predictors.
Well, VAR2 *has* to be highly correlated with VAR1 - among others.
I'll guess that if you check using PROC CORR you'll find that the
correlation between your VAR1 and VAR2 is high, though short of 1.00
unless VAR1 takes only two distinct values. ;-)
I suggest that you not use VAR1 and VAR2 in the same model. They
are telling you much the same information, and VAR2 clearly carries
less of it than VAR1 because of the 'binning'.
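One way to run the PROC CORR check yourself is sketched below. WORK.DEMO and its simulated values are invented for illustration. Substitute your own data set.

```sas
/* Simulated data, purely illustrative (not the poster's data). */
data work.demo;
    call streaminit(2002);
    do i = 1 to 200;
        var1 = 50 * rand('uniform');   /* continuous, $-like amount */
        var2 = (var1 >= 25);           /* binned copy of var1       */
        output;
    end;
    drop i;
run;

/* Pearson correlation between the original and the binned variable. */
proc corr data=work.demo pearson;
    var var1 var2;
run;
```

Because VAR2 is a 0/1 variable, the Pearson coefficient reported here is the point-biserial correlation between VAR1 and the cut-off indicator.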
David Cassell, CSC
Senior computing specialist