LISTSERV at the University of Georgia
Date:   Tue, 5 Feb 2008 13:02:59 -0600
Reply-To:   "Peck, Jon" <peck@spss.com>
Sender:   "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:   "Peck, Jon" <peck@spss.com>
Subject:   Re: R^2 computation in SPSS
Comments:   To: Joanne Tsai <jtsai@targetrx.com>
In-Reply-To:   <B2A95412067E5C4CBA09E2E92D81BF290B38F584@TRX-V01.targetrx.com>
Content-Type:   text/plain; charset="utf-8"

There are two issues here. First, you are using different samples when you go from listwise to pairwise deletion. The two samples could differ in population characteristics, especially if values are not missing at random. Imagine, for example, a situation where men rarely answer some question while women usually answer. Then the pairwise-sample gender proportion will be very different from the listwise one, and if males and females differ in the regression response, the results will be quite different in the two samples.

Second, the residual means are doubtless different. Run Descriptives on the residuals from each analysis. You will see how the contribution to the R^2 from the residual means differs. You might also look at regression diagnostics.

HTH, Jon Peck

-----Original Message----- From: Joanne Tsai [mailto:jtsai@targetrx.com] Sent: Tuesday, February 05, 2008 11:54 AM To: Peck, Jon; SPSSX-L@LISTSERV.UGA.EDU Subject: RE: Re: [SPSSX-L] R^2 computation in SPSS

Hi Jon, sorry I didn't make my question clear. I meant to ask: trying both listwise and pairwise, I observed that the two sets of estimated coefficients are similar, though R^2 seemed to perform a lot better with pairwise. I am very curious about the reason behind it. Can I get better coefficients by using pairwise, since it doesn't throw out any data? And how is R^2 computed with pairwise, and why is it a lot better than the R^2 computed listwise?

-----Original Message----- From: Peck, Jon [mailto:peck@spss.com] Sent: Tuesday, February 05, 2008 1:47 PM To: Joanne Tsai; SPSSX-L@LISTSERV.UGA.EDU Subject: RE: Re: [SPSSX-L] R^2 computation in SPSS

Regarding the R^2, when there is a constant term in the regression, the residuals have mean zero, so the sums of squares in the numerator and denominator match up with correlation coefficients. If there is no constant term, the residual mean is not zero, so the sums of squares in both numerator and denominator have a contribution from the mean square, so the explained/total sum of squares will be closer to one.
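
The effect described above can be sketched numerically. This is only an illustration on simulated data, assuming the no-constant R^2 is computed from raw (uncentered) sums of squares, as the explanation implies; the data, seed, and variable names are made up for the example and are not from the thread.

```python
import numpy as np

# Simulated data (an assumption for illustration): the true intercept is zero,
# so the slope estimates with and without a constant should be similar.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(5, 10, size=n)
y = 2.0 * x + rng.normal(0, 3, size=n)

# Model with a constant: residuals have mean zero, and R^2 compares the
# residual sum of squares with the sum of squares about the mean of y.
X_const = np.column_stack([np.ones(n), x])
beta_const, *_ = np.linalg.lstsq(X_const, y, rcond=None)
resid_const = y - X_const @ beta_const
r2_const = 1 - (resid_const @ resid_const) / np.sum((y - y.mean()) ** 2)

# Model through the origin: residuals need not have mean zero, and the
# uncentered R^2 compares the residual sum of squares with the raw sum of
# squares of y, which includes the contribution of the mean of y.
X_origin = x[:, None]
beta_origin, *_ = np.linalg.lstsq(X_origin, y, rcond=None)
resid_origin = y - X_origin @ beta_origin
r2_origin = 1 - (resid_origin @ resid_origin) / np.sum(y ** 2)

print("slope with constant:   ", beta_const[1])
print("slope through origin:  ", beta_origin[0])
print("R^2 with constant:     ", r2_const)
print("R^2 through origin:    ", r2_origin)
print("residual mean (const): ", resid_const.mean())
print("residual mean (origin):", resid_origin.mean())
```

Because the mean of y is large relative to its spread, the raw sum of squares in the denominator is much larger than the centered one, so the through-origin R^2 lands much closer to one even though the fit is essentially the same.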

Now, here is the quiz for today: construct an ordinary least squares linear regression example where ALL of the residuals are positive.
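
One possible construction for the quiz, offered as a sketch rather than the intended answer: with a constant term the residuals must sum to zero, so the example below assumes the constant is omitted. The three data points are invented for illustration.

```python
import numpy as np

# Hypothetical data: y is roughly 10 + x, but the model is forced
# through the origin (no constant term).
x = np.array([-1.0, 0.0, 2.0])
y = np.array([9.0, 10.0, 12.0])

# OLS slope for a regression through the origin: sum(x*y) / sum(x^2).
slope = (x @ y) / (x @ x)

# Every residual y - slope * x comes out positive for these points.
residuals = y - slope * x

print("slope:", slope)
print("residuals:", residuals)
```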

Regards, Jon Peck

-----Original Message----- From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Joanne Tsai Sent: Tuesday, February 05, 2008 11:30 AM To: SPSSX-L@LISTSERV.UGA.EDU Subject: Re: [SPSSX-L] R^2 computation in SPSS

Thank you for the answer. Is there any way I can find out why the coefficient estimates from the two methods are similar, but the R^2 values are not? (I will be throwing out 25% of the data if I use listwise.) I am assuming the model should go through the origin, so the second question is fully answered. Thank you.

-----Original Message----- From: Peck, Jon [mailto:peck@spss.com] Sent: Tuesday, February 05, 2008 11:17 AM To: Joanne Tsai; SPSSX-L@LISTSERV.UGA.EDU Subject: RE: Re: [SPSSX-L] R^2 computation in SPSS

If you use pairwise deletion, you can't be sure of the statistical properties of your regression estimates. Pairwise deletion is rarely appropriate. In fact, with pairwise deletion you can't even be sure that the covariance matrix is positive definite. Stick with listwise deletion.
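
The point about positive definiteness can be illustrated with a small contrived data set. This is a sketch only: the data are invented so that each pair of variables is observed on a different set of rows, and `pairwise_corr` is a hypothetical helper written for this example, not an SPSS routine.

```python
import numpy as np

nan = np.nan
# Columns are variables A, B, C. Each pair is observed on different rows.
data = np.array([
    [1, 1, nan], [2, 2, nan], [3, 3, nan],   # A and B observed: corr +1
    [nan, 1, 1], [nan, 2, 2], [nan, 3, 3],   # B and C observed: corr +1
    [1, nan, 3], [2, nan, 2], [3, nan, 1],   # A and C observed: corr -1
], dtype=float)

def pairwise_corr(data):
    """Correlation matrix where each entry uses only the rows on which
    both variables of that pair are non-missing (pairwise deletion)."""
    k = data.shape[1]
    corr = np.eye(k)
    for i in range(k):
        for j in range(i + 1, k):
            both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            r = np.corrcoef(data[both, i], data[both, j])[0, 1]
            corr[i, j] = corr[j, i] = r
    return corr

corr = pairwise_corr(data)
eigenvalues = np.linalg.eigvalsh(corr)

print(corr)
print("eigenvalues:", eigenvalues)

# Listwise deletion keeps only rows with no missing values at all.
complete_rows = data[~np.isnan(data).any(axis=1)]
print("complete cases:", len(complete_rows))
```

The pairwise matrix says A and B move together, B and C move together, yet A and C move in opposite directions, which no single sample could produce. That inconsistency shows up as a negative eigenvalue, so the matrix is not a valid correlation matrix. The example is deliberately extreme (it has no complete cases at all); real data usually fail less dramatically, if at all.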

As for the constant term, think of the model you are testing. Omitting the constant term is perfectly appropriate if your model implies that the regression line should go through the origin and you are confident of linearity. In most cases, though, you should just keep the constant term and not test it for significance. Forcing the regression line through the origin does produce an R^2 that isn't really comparable to the usual one.

HTH, Jon Peck

-----Original Message----- From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Joanne Tsai Sent: Tuesday, February 05, 2008 9:01 AM To: SPSSX-L@LISTSERV.UGA.EDU Subject: Re: [SPSSX-L] R^2 computation in SPSS

1. Yes, if I use listwise, R^2 is similar between Excel and SPSS. But which R^2 is more reliable? I have 0.85 for pairwise and 0.65 for listwise. I'd love to show the higher R^2, but would not want to draw a wrong conclusion based on it. Or is there any other tool in which I can plot the graph and get a similar 0.85? Is there anywhere I can find more information about the algorithm for pairwise?
2. When I run the linear regression including the constant, the p-value on the constant is 0.91, so I would think it's not significant. Can I remove the constant just based on the p-value I got? Is that fair?
Thank you so much for your pointers!

-----Original Message----- From: ViAnn Beadle [mailto:vab88011@gmail.com] Sent: Tuesday, February 05, 2008 9:01 AM To: Joanne Tsai; SPSSX-L@LISTSERV.UGA.EDU Subject: RE: R^2 computation in SPSS

Try your SPSS analysis again using listwise deletion of missing data. I'd guess you'll get the same results as Excel which AFAIK doesn't have an algorithm for pairwise.

When you do not include the constant, you are testing an entirely different model--that the relation is not significantly different from 0. Is that what you want?

-----Original Message----- From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Joanne Tsai Sent: Tuesday, February 05, 2008 6:48 AM To: SPSSX-L@LISTSERV.UGA.EDU Subject: R^2 computation in SPSS

Dear Co-listers: I have recently encountered the following question. I got a pretty good R^2, 0.85, using the Linear Regression model in SPSS. (Not all sample points have values for the dependent variable and all of the independent variables, so I used the pairwise option.) But when I plotted the predicted number vs. the actual number (my dependent variable) in Excel and Curve Expert, I could only get an R^2 of around 0.50. I am not sure what's causing this discrepancy. Is it due to the computation in SPSS, or to the fact that it's computed pairwise? My other question is: what can one say about the result when one uses the linear regression model without including the constant? The R^2 is higher, but isn't that biased? Can one still use it as a validation method? Thank you so much for your help!

Joanne

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


