Date: Tue, 6 Jan 2009 10:12:02 -0800
Reply-To: stringplayer_2@yahoo.com
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: PROC LOGISTIC tests
In-Reply-To: <ed96582c-03a4-4085-89f5-5c04cb39acf9@o40g2000prn.googlegroups.com>
Content-Type: text/plain; charset=utf-8
--- On Tue, 1/6/09, AgEconomist <matttbogard@GMAIL.COM> wrote:
> From: AgEconomist <matttbogard@GMAIL.COM>
> Subject: PROC LOGISTIC tests
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Tuesday, January 6, 2009, 9:15 AM
> Are there any commands in SAS that would test a logit model in PROC
> LOGISTIC for multicollinearity, heteroskedasticity, or serial
> correlation ? PROC REG has the VIF, DW options in the model statement
> but not in PROC LOGISTIC. I could probably write a routine, but
> frankly, I’m not even sure about how to get the ‘residuals’ necessary
> for some of these tests. I know that RESDEV and RESCHI in the model
> statement gives residuals, but can I treat these the way I would treat
> residuals from OLS?
Matt,
First of all, collinearity does not depend on the left-hand side
variable. Residuals have no bearing on the issue of collinearity.
Thus, you can use the collinearity diagnostics available in PROC
REG to assess the magnitude of any collinearity issues. By the
way, I would suggest use of the COLLIN option available on the
MODEL statement in PROC REG rather than VIF for assessing problems
due to collinearity. To be explicit, you can assess collinearity
issues by fitting the "wrong" model through code like the
following:
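proc reg data=mydata;
   /* The fitted model is not of interest here; only the */
   /* COLLIN diagnostics for the predictors are.         */
   model <binary response> = <predictors> / collin;
run;
quit;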
Of course, if any of your predictor variables are categorical
with K levels, then you will need to construct a set of K-1 dummy
variables beforehand which are named in the <predictors> list.
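As a rough sketch of that dummy-variable step, suppose you have a
three-level character variable REGION with values 'N', 'S', and 'W',
a 0/1 response Y, and continuous predictors X1 and X2. All of these
names are made up for illustration, since I have not seen your data.
Something along these lines should work:

```sas
/* Hypothetical names throughout: mydata, region, y, x1, x2 */
data mydata2;
   set mydata;
   /* K = 3 levels, so K-1 = 2 dummies; 'W' is the reference level. */
   /* Records with a missing REGION keep missing dummies, so they   */
   /* are dropped rather than silently counted as the reference.    */
   if not missing(region) then do;
      region_N = (region = 'N');
      region_S = (region = 'S');
   end;
run;

proc reg data=mydata2;
   model y = x1 x2 region_N region_S / collin;
run;
quit;
```

Which level you pick as the reference changes the individual
diagnostics somewhat, so it is worth choosing a level that is not
sparse.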
Heteroskedasticity and serial correlation are issues related to
the response. But please observe that for a binary response,
heteroskedasticity is not a separate issue: the variance of a
binary response is p(1-p), so it is completely determined by the
mean, and the logistic model already accounts for it.
Overdispersion can be an issue. An overdispersed binomial occurs
when there is variation in the probability of a success across
units (subjects) that the model does not capture. There are a
number of ways of dealing with this overdispersion.
Basically, an overdispersed binomial is similar to a binomial
with serial correlation in that there is some form of
nonindependence among the observations.
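One of those ways, sketched here under the assumption that your data
are grouped so that each record holds a count of events R out of N
trials (hypothetical names, as are GROUPED, X1, and X2), is to have
PROC LOGISTIC estimate a dispersion parameter through the SCALE=
option on the MODEL statement:

```sas
/* Hypothetical names: grouped, r, n, x1, x2 */
proc logistic data=grouped;
   /* SCALE=PEARSON estimates the dispersion as the Pearson      */
   /* chi-square divided by its degrees of freedom and rescales  */
   /* the covariance matrix of the estimates accordingly.        */
   model r/n = x1 x2 / scale=pearson;
run;
```

With one record per subject (single-trial syntax), SCALE= also needs
the AGGREGATE option so that the procedure knows how to form the
subpopulations used to compute the chi-square statistic.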
This brings us to the topic of serial correlation. Serial
correlation could be an issue. But if you have correlated
responses, you should not be using the LOGISTIC procedure to
begin with. You should be using one of the many procedures
available in SAS for dealing with correlated binary response
values. These include:
1) the GENMOD procedure with a REPEATED statement
2) the GLIMMIX procedure
3) the NLMIXED procedure
This is an incomplete list. No doubt there are other procedures
for fitting a logistic regression model where there are
correlated responses.
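To give a flavor of the first two, here is a minimal sketch. It
assumes repeated 0/1 responses Y on subjects identified by a variable
ID, with predictors X1 and X2; these names and the choice of an
exchangeable working correlation are my assumptions, not something
taken from your post.

```sas
/* Hypothetical names: mydata, id, y, x1, x2 */

/* 1) GEE (marginal) model with an exchangeable working correlation */
proc genmod data=mydata descending;
   class id;
   model y = x1 x2 / dist=binomial link=logit;
   repeated subject=id / type=exch;
run;

/* 2) Random-intercept (subject-specific) logistic model */
proc glimmix data=mydata method=quad;
   class id;
   model y(event='1') = x1 x2 / dist=binary link=logit solution;
   random intercept / subject=id;
run;
```

The two fits answer slightly different questions: the GENMOD
coefficients are population-averaged, while the GLIMMIX coefficients
are conditional on the subject effect, so the estimates need not
agree.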
So, you might want to post again to SAS-L with a description of
the data you have at hand and the problems that you believe are
present in those data. A more helpful response may be offered
if you give a more careful description of the problems that are
presented to you in the data you must analyze.
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------