| Date: | Thu, 8 Nov 2007 11:44:58 -0500 |
| Reply-To: | Peter Flom <peterflomconsulting@mindspring.com> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Peter Flom <peterflomconsulting@MINDSPRING.COM> |
| Subject: | Re: Univariate tests before multivariate modeling in logistic regression |
"cat.." <cat.b41@GMAIL.COM> wrote
>
>I'd like to get your opinion, as statisticians, about one thing that
>surprised me in a publication I read.
>
>It suggests a strategy for fitting a multivariate logistic regression.
>
>- Identification of a primary set of covariates (risk factors)
>- Performing of univariate testings:
> Covariate * exposition in disease free subjects --> p-value1
Your subject line indicates univariate tests before multivariate modeling, but in the body of the message, I don't see any multivariate modeling at all.
There isn't anything wrong with doing univariate testing (I would call it bivariate, but no big deal) before multivariate modeling. Exploring your data in multiple ways is a good idea. But there is something wrong with using bivariate screening as a variable selection tool. For one thing, a variable might be important only after controlling for another variable, and a screen that looks at each variable on its own would discard it.
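To make that last point concrete, here is a small sketch in Python using only the standard library. The counts are made up for illustration (they are not from the publication under discussion), and the names x, z, counts, stratum_or, and crude_or are hypothetical. The exposure x shows no bivariate association with disease at all, yet within each level of a second variable z its odds ratio is 4.

```python
# Hypothetical counts, invented for illustration only.
# Key: (z stratum, x exposure); value: (cases, non-cases).
counts = {
    (0, 1): (40, 360),    # z = 0, exposed
    (0, 0): (3, 108),     # z = 0, unexposed
    (1, 1): (160, 40),    # z = 1, exposed
    (1, 0): (102, 102),   # z = 1, unexposed
}


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: exposed (a cases, b non-cases),
    unexposed (c cases, d non-cases)."""
    return (a * d) / (b * c)


def stratum_or(z):
    """Odds ratio for x within one level of z (x controlled for z)."""
    a, b = counts[(z, 1)]
    c, d = counts[(z, 0)]
    return odds_ratio(a, b, c, d)


def crude_or():
    """Odds ratio for x ignoring z, which is all a bivariate screen sees."""
    a = sum(counts[(z, 1)][0] for z in (0, 1))
    b = sum(counts[(z, 1)][1] for z in (0, 1))
    c = sum(counts[(z, 0)][0] for z in (0, 1))
    d = sum(counts[(z, 0)][1] for z in (0, 1))
    return odds_ratio(a, b, c, d)


print("crude OR for x:      ", crude_or())
print("OR for x within z=0: ", stratum_or(0))
print("OR for x within z=1: ", stratum_or(1))
```

With these counts the crude table is 200 cases and 400 non-cases among the exposed versus 105 and 210 among the unexposed, so a bivariate test of x would find nothing. Because the within-stratum odds ratio is the same in both strata, a main-effects logistic model with x and z would give x a coefficient of log(4), about 1.39.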
Could you provide some more details on what the authors did?
How did they identify a set of covariates?
What univariate (bivariate) testing did they do?
How many candidate independent variables were there?
What sample size?
What is the state of theory about the relationship between the DV and the IVs? If there is strong theory, then the approach will be different than if the research is more exploratory.
A good book on this is Frank Harrell's Regression Modeling Strategies.
Other areas to explore are partial least squares, principal component regression, the lasso, least angle regression, and multimodel averaging.
Hope this helps.
Peter