LISTSERV at the University of Georgia
Date:         Wed, 11 May 2005 08:52:28 -0700
Reply-To:     gblockhart@YAHOO.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         gblockhart@YAHOO.COM
Organization: http://groups.google.com
Subject:      Dependent sample difference in mean test
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset="iso-8859-1"

I have two dependent samples with different numbers of observations. I need to know whether the means of the two samples are statistically different from each other.

My sample_1 has approximately 800,000 observations. Sample_2 has approximately 130,000 observations.

I have run a regression on sample_1 to generate coefficients. I then "fit" the coefficients from sample_1 to the characteristics of sample_2 observations. This gives me a predicted value for sample_2 based on sample_1 coefficients. I then calculate a residual by subtracting each sample_2 observation actual value from the predicted value (predicted from the sample_1 coefficients applied to the sample_2 characteristics).

Then I take the mean of the residuals from sample_2.

I repeat the process in the opposite direction, i.e., I run a regression on sample_2, get coefficients, then fit the coefficients from sample_2 to the sample_1 characteristics. This generates a predicted value, which I subtract from each sample_1 actual value to get the sample_1 residuals. I then take the mean sample_1 residual.
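To make the two-way procedure concrete, here is a minimal sketch in Python with NumPy (not SAS). The function names, the use of ordinary least squares with an intercept, and the sign convention (predicted minus actual, applied in both directions) are assumptions made for illustration, not details confirmed by the description above.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; returns [b0, b1, ...]."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def mean_cross_residual(X_fit, y_fit, X_apply, y_apply):
    """Estimate coefficients on one sample, apply them to the other sample,
    and return the mean of (predicted - actual) on that other sample.
    The sign convention here is an assumption."""
    beta = fit_ols(X_fit, y_fit)
    design = np.column_stack([np.ones(len(X_apply)), X_apply])
    predicted = design @ beta
    return float(np.mean(predicted - y_apply))

# Hypothetical usage, with X1/y1 holding sample_1 and X2/y2 holding sample_2:
# mean_resid_2 = mean_cross_residual(X1, y1, X2, y2)  # sample_1 coefficients on sample_2
# mean_resid_1 = mean_cross_residual(X2, y2, X1, y1)  # sample_2 coefficients on sample_1
```

If the two samples differ only by a level shift in the outcome, the two mean residuals come out equal in size and opposite in sign, provided the same sign convention is used in both directions.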

I expect the sample_1 and sample_2 residuals to be of opposite sign. I need to test the difference in the mean residuals. I have two dependent samples (of residuals) and I have very different sample sizes (of residuals).

I can assume that the two sets of residuals are perfectly negatively correlated and run a t-test, then assume that they are uncorrelated and run it again. This will give me a range of t-stats for my test.
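The bracketing idea can be sketched as follows. This assumes the dependence enters only through a correlation `rho` between the two mean estimates, so that Var(m1 - m2) = se1^2 + se2^2 - 2*rho*se1*se2, where se_i = sd_i / sqrt(n_i). The function name and the choice to parameterize by `rho` are illustrative assumptions, not an established test for this setup.

```python
import math

def diff_in_means_t(mean1, sd1, n1, mean2, sd2, n2, rho):
    """t-statistic for (mean1 - mean2) when the two mean estimates are
    assumed to have correlation rho (an assumed, not estimated, value)."""
    se1 = sd1 / math.sqrt(n1)  # standard error of the first mean
    se2 = sd2 / math.sqrt(n2)  # standard error of the second mean
    var_diff = se1**2 + se2**2 - 2.0 * rho * se1 * se2
    return (mean1 - mean2) / math.sqrt(var_diff)

# Hypothetical usage: bracket the statistic between the two assumptions.
# t_low  = diff_in_means_t(m1, s1, n1, m2, s2, n2, rho=-1.0)  # perfectly negatively correlated
# t_high = diff_in_means_t(m1, s1, n1, m2, s2, n2, rho=0.0)   # uncorrelated
```

With `rho = -1` the standard error of the difference is se1 + se2, its largest possible value, so that case gives the most conservative t-stat; `rho = 0` reduces to the usual unequal-variance two-sample statistic.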

But I was hoping someone could suggest a stronger (or more direct) test. I'm afraid the range will be too wide to give a clear result.

So, this is a statistical theory question instead of a direct SAS question.

Thanks.

