Date: Tue, 6 Dec 2005 19:04:36 -0800
Reply-To: shiling99@YAHOO.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: shiling99@YAHOO.COM
Organization: http://groups.google.com
Subject: Re: R Square
In-Reply-To: <200512062038.jB6KE12R020379@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="iso-8859-1"
Suppose y = a + b1*x + err,
where err ~ normal(0, sigma**2), i.e. sigma is the error standard deviation.
Here is a sketch of the argument.
R**2 = SSreg/SStotal = [sum over obs (Yhat-Ybar)**2] / [sum over obs (Y-Ybar)**2]
With Sxx = sum over obs (X-Xbar)**2,
E(SSreg) = sigma**2 + b1**2*Sxx and E(SSres) = (n-2)*sigma**2, so
E(SStotal) = E(SSreg) + E(SSres) = (n-1)*sigma**2 + b1**2*Sxx
Replacing SSreg and SStotal by their expected values gives, approximately,
R**2 = (sigma**2 + b1**2*Sxx) / [(n-1)*sigma**2 + b1**2*Sxx]
Let w = b1**2*Sxx / (n*sigma**2). Dividing numerator and denominator by
n*sigma**2, R**2 in terms of w is
R**2 = [n**(-1) + w] / [1 - n**(-1) + w]
The derivative of R**2 w.r.t. w is (1 - 2/n) / [1 - n**(-1) + w]**2,
which is > 0 for n > 2, so R**2 increases with w.
If you regard w = b1**2*Sxx / (n*sigma**2) as a signal-to-noise ratio, then
the higher the ratio, the higher the R**2.
A wider range of X means a larger Sxx, and therefore a larger w.
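To see the effect numerically, here is a minimal sketch that evaluates the approximate R**2 formula for a few values of Sxx. The values n=100, b1=2 and sigma=1 and the Sxx grid are illustrative assumptions, and the data set name approx is made up for this example.

```sas
/* Evaluate the approximate R**2 for several assumed values of Sxx */
data approx;
  n = 100; b1 = 2; sigma = 1;      /* assumed, illustrative values */
  do sxx = 5, 25, 100, 400;        /* assumed grid of Sxx values */
    w   = b1**2 * sxx / (n * sigma**2);   /* signal-to-noise ratio */
    rsq = (1/n + w) / (1 - 1/n + w);      /* approximate R**2 */
    output;
  end;
run;
proc print data=approx noobs;
  var sxx w rsq;
run;
```

The printed rsq column should rise as sxx rises, since only w changes from row to row.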
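Here is a simulation program.
/* Simulate 100 observations from y = 2 + 2*x + err, err ~ normal(0, 1) */
data t1;
  do i = 1 to 100;
    x = rannor(123);
    y = 2 + 2*x + rannor(123);
    output;
  end;
run;
/* Narrow range of X (small Sxx): expect a lower R**2 */
proc reg data=t1;
  model y = x;
  where abs(x) < 0.5;
run;
quit;
/* Wide range of X (large Sxx): expect a higher R**2 */
proc reg data=t1;
  model y = x;
  where abs(x) > 0.5;
run;
quit;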
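To compare the PROC REG output with the approximation above, the following sketch computes n, Sxx, w and the approximate R**2 for each subset. It assumes the data set t1 from the simulation already exists and plugs in the true simulation values b1=2 and sigma=1; the names t2, sx, sx2, grp and rsq_approx are made up for this example.

```sas
/* Label each observation by the subset it falls in.              */
/* abs(x) = 0.5 exactly would land in 'wide'; with continuous x   */
/* this has probability zero.                                     */
data t2;
  set t1;
  length grp $6;
  if abs(x) < 0.5 then grp = 'narrow';
  else grp = 'wide';
run;
/* n and corrected sum of squares of x (Sxx) per subset */
proc means data=t2 noprint nway;
  class grp;
  var x;
  output out=sx n=n css=sxx;
run;
/* Plug into the approximate R**2 formula */
data sx2;
  set sx;
  b1 = 2; sigma = 1;               /* true values used in the simulation */
  w = b1**2 * sxx / (n * sigma**2);
  rsq_approx = (1/n + w) / (1 - 1/n + w);
run;
proc print data=sx2 noobs;
  var grp n sxx w rsq_approx;
run;
```

The rsq_approx values are only a rough guide, since the formula replaces SSreg and SStotal by their expectations; the R-Square reported by PROC REG for each subset will differ somewhat.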
HTH.