LISTSERV at the University of Georgia
Date:         Fri, 16 May 2008 10:35:14 -0700
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: PROC GLIMMIX--How to Check Linearity Assumption
In-Reply-To:  <81074a80-9057-4b89-b627-6239b8762e3a@i76g2000hsf.googlegroups.com>
Content-Type: text/plain; charset=iso-8859-1

--- Shiling Zhang <shiling99@YAHOO.COM> wrote:

> > How do I check that the 50K obs of the above predictors are
> > linearly related to the FRAUD variable?
>
> It is neither necessary nor sufficient. In fact the model assumes
> that {logodds of FRAUD} NOT FRAUD is linearly related to your
> linearly predictors.
>
> Here is a way to views it.
> 1) Bin a predictor into, say 30 bins. i=1 to 30
> 2) Calculate logodds of FRAUD for each bin.
> 3) plot logodds of FRAUD against bined predictor values(mean,
>    median)
>
> Based on what you see, you may take a proper transformation. One is
> parametric and the other is non-parametric. If you have a large
> number of events( FRAUD), the non-parametric way is prefered.
> ......
>
> HTH

Yes, this is certainly one way to examine the linearity assumption. It works best if you have only a single predictor variable. In the multivariable setting, where the effect of each predictor is conditional on the effects of the other predictors, a plot of the unadjusted log odds against one binned predictor may not work as well. Also, I don't see the need for any more than 10 bins in most circumstances.
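For anyone who wants to try the binning approach outside of SAS, here is a minimal sketch in Python (standard library only). The function name binned_logodds is made up for illustration, and the 0.5 added to the event and non-event counts is my own assumption, there only so that a bin with no events (or no non-events) does not produce an infinite log odds. It forms equal-count bins of one predictor and returns the bin mean and the empirical log odds of the 0/1 outcome for each bin.

```python
import math

def binned_logodds(x, y, n_bins=10):
    """Return one (mean_x, empirical_logit, n_obs) tuple per equal-count bin.

    x : predictor values
    y : 0/1 outcome (e.g. FRAUD), same length as x
    Hypothetical helper for illustration only.
    """
    pairs = sorted(zip(x, y))          # order observations by the predictor
    n = len(pairs)
    if n_bins < 1 or n_bins > n:
        raise ValueError("n_bins must be between 1 and the number of observations")
    rows = []
    for b in range(n_bins):
        lo = b * n // n_bins           # first observation in this bin
        hi = (b + 1) * n // n_bins     # one past the last observation
        chunk = pairs[lo:hi]
        events = sum(yi for _, yi in chunk)
        nonevents = len(chunk) - events
        mean_x = sum(xi for xi, _ in chunk) / len(chunk)
        # 0.5 continuity correction (an assumption) keeps the logit finite
        logit = math.log((events + 0.5) / (nonevents + 0.5))
        rows.append((mean_x, logit, len(chunk)))
    return rows
```

Plot the second element of each tuple against the first. A roughly straight line is consistent with linearity on the logit scale for that predictor taken alone; it says nothing about the relationship after adjusting for the other predictors.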

Another way to examine whether linearity holds is to include terms in your model which represent some departure from linearity. If the model fit improves significantly when these terms are included, then that is evidence that the assumption of linearity in the predictors does not hold. Often, this is done simply by including polynomial terms of your predictors. A somewhat more sophisticated approach is to employ a spline basis for representing nonlinearity. My favorite spline basis is the restricted cubic spline, as discussed in

Harrell, Frank. "Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis." Springer, 2001.
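One common way to judge "significant improvement" when nonlinear terms are added is a likelihood ratio test between the linear-only model and the model with the extra terms. The sketch below (Python, standard library only) is just the arithmetic. It assumes that you already have the maximized log likelihoods of two nested fits and that both are true likelihoods, not pseudo-likelihoods. The names chi2_sf and lr_test are hypothetical.

```python
import math

def chi2_sf(stat, df):
    """Upper tail probability of a chi-square variate with integer df >= 1.

    Uses the recurrence Q(a+1, z) = Q(a, z) + z**a * exp(-z) / gamma(a+1)
    for the regularized upper incomplete gamma function, with z = stat/2.
    """
    if df < 1 or stat < 0:
        raise ValueError("need df >= 1 and stat >= 0")
    z = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))     # closed form for df = 1
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def lr_test(loglik_reduced, loglik_full, df):
    """Likelihood ratio statistic and p-value for two nested models.

    df is the number of extra parameters in the full model
    (e.g. 1 if only a squared term was added).
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return stat, chi2_sf(stat, df)
```

For example, lr_test(ll_linear, ll_quadratic, 1) would test a single added squared term, where ll_linear and ll_quadratic are placeholders for the two log likelihoods from your own fits. A small p-value suggests that the linear-only specification is inadequate for that predictor.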

There are a couple of SAS macros available for generating splined variables. Go to http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/SasMacros and follow the links from there.
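To give a sense of what a splined variable looks like, here is a minimal Python sketch of a restricted cubic spline basis built from the truncated power form given in Harrell's book. It is not a port of the SAS macros at the link above, which may differ in knot placement and scaling. The function name rcs_basis is hypothetical, and the division by the squared knot range is included only to keep the columns on a scale comparable to x.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns k-1 values for k knots: x itself, followed by k-2 nonlinear
    terms that are zero below the first knot and linear beyond the last.
    Hypothetical helper for illustration only.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3 or len(set(t)) != k:
        raise ValueError("need at least 3 distinct knots")
    d = t[-1] - t[-2]                    # gap between the last two knots
    norm = (t[-1] - t[0]) ** 2           # scaling by squared knot range

    def pos3(u):
        # truncated cubic: u**3 when u > 0, otherwise 0
        return u ** 3 if u > 0 else 0.0

    cols = [float(x)]
    for j in range(k - 2):
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / d
                + pos3(x - t[-1]) * (t[-2] - t[j]) / d)
        cols.append(term / norm)
    return cols
```

Apply the function to every observation of a predictor and enter the resulting columns in the model in place of the single linear term. A joint test of the coefficients on the nonlinear columns (all columns after the first) is then a test of linearity for that predictor.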

HTH,

Dale

---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------

