Date: Wed, 2 Jan 2008 14:56:39 -0600
Reply-To: Mary <mlhoward@avalon.net>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: A conceptual question about regression
Content-Type: text/plain; charset="iso-8859-1"
Tommy,
It doesn't seem quite right to me. For instance, I do ophthalmology research, and the presence of drusen (inflammatory spots) almost always precedes the occurrence of the disease AMD, while a certain gene SNP, a factor H SNP, would be present at birth (since it is inherited or an abnormality in the gene occurs at birth). So in this example, would we want a model that predicts the factor H abnormality given independent variables of having AMD or not and having drusen or not?
In this example, that seems backward to me: the dependent variable should be the final event, not the initial event.
Thus a model
model amd_disease= b0 + b1*factorh_snp + b2*drusen_present;
makes more sense to me than a model like
model factorh_snp = b0 + b1*drusen_present + b2*amd_disease;
So I'd agree with you in principle: it seems backward to use the event that occurs first as the dependent variable. Perhaps you could tell us what your variables are, so we can see whether there is a situation where that direction makes more sense.
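For what it's worth, here is a rough sketch of how the first model above might actually be fit in SAS. The dataset name work.amd_study and the 0/1 coding of all three variables are assumptions made for illustration, and since amd_disease would presumably be a yes/no outcome, the sketch uses PROC LOGISTIC rather than the linear form written above.

```sas
/* Hypothetical dataset: work.amd_study, one row per patient.       */
/* Assumed coding: amd_disease, factorh_snp, drusen_present are 0/1. */
proc logistic data=work.amd_study;
   /* The later event (AMD) is the dependent variable;              */
   /* the earlier events (SNP, drusen) are the predictors.          */
   model amd_disease(event='1') = factorh_snp drusen_present;
run;
```

The event='1' option just tells SAS to model the probability that AMD is present rather than absent.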
-Mary
----- Original Message -----
From: Tommy Xie
To: SAS-L@LISTSERV.UGA.EDU
Sent: Wednesday, January 02, 2008 2:12 PM
Subject: A conceptual question about regression
Hi all,
I'd appreciate it if anyone could help me answer this question. If variable
X takes place before Y and Z, which may indicate some sort of causal
effect, is it O.K. to run a regression like X=B1*Y+B2*Z? I suspect
that the regression has the wrong direction.
Thanks in advance!