LISTSERV at the University of Georgia
Date:         Wed, 25 May 2005 15:15:16 -0700
Reply-To:     cassell.david@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:      Re: student t test after multiple imputation
In-Reply-To:  <1117052583.858288.13020@o13g2000cwo.googlegroups.com>
Content-type: text/plain; charset=US-ASCII

sunwenyu@GMAIL.COM wrote:
> I have a question about how to combine student t test results after
> multiple imputations. Can I get just one P value by combining multiple
> t test results? I couldn't find an example from SAS documentation,
> though SAS did provide samples on how to combine results from
> regression analysis or mixed model analysis by using the MIANALYZE
> procedure.

First of all, what are you trying to test? I'm assuming you have a data set with some holes in it. Are you trying to take a variable Y and test H0: mu=0? Are you trying to test H0: mu=mu0 where mu0 is a non-zero constant? Are you trying to create a confidence interval for mu?

Second, are your Y values normally distributed? And independent? And identically distributed? Oh, good. Because, even though the basic t test is relatively robust to departures from normality, things can go haywire just by tossing in a few outliers, or adding a contaminating distribution, or including serial correlation, or... If your underlying assumptions are not met, then you should look at a different test.

For a simple t-test, PROC MI and PROC MIANALYZE are simple to use. In particular, when you have more than one variable you want to test, you can do everything in PROC MI *without* ever jumping through something like PROC UNIVARIATE and then passing the results back into PROC MIANALYZE. Take a look at this example (SAS 9.1.2):

/* X is normally distributed. Y has a couple outliers. */

data temp1(drop=seed);
   seed = 38596303;
   do n = 1 to 36;
      x = 42 + 10*rannor(seed);
      y = x + ( mod(n,10)=0 )*40*ranuni(seed);
      output;
      if n in (8,16,24,32) then do; y=.; output; end;
      if n in (4,12,20,28,36) then do; x=.; output; end;
   end;
run;

proc print data=temp1;   /* take a look at the data if you want */
run;

/* PROC MI will do the t-tests for you, and even let you choose
   the mu0 that you want for each of your variables. */

proc mi data=temp1 out=mi1 nimpute=5 seed=58703654
        mu0=40 40;
   var x y;
run;

If you run this code, you'll see that PROC MI will do the imputation for you. The default is what you want for normally distributed data when you're not trying to impute Y as a linear regression on X (or several IVs). You'll get a single-chain MCMC. (Note for people following the AR(1) thread going yesterday and today: the MCMC uses a burn-in period of 200 iterations.) Then you get information on the variance within and between imputations. This tells you how much of the noise you see is due to the activity of filling in your gaps. Finally, you get the tests you wanted. Note that the mean of Y is only a little higher than the mean of X, and the variance of Y is only a bit larger than the variance of X. But the p-values are quite distinct.
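For reference, the within/between variance output and the single combined test follow Rubin's combining rules. A sketch of those rules, with notation chosen here for illustration: m is the number of imputations, \hat{Q}_i is the estimated mean from imputation i, and U_i is its squared standard error.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{combined point estimate} \\
W &= \frac{1}{m}\sum_{i=1}^{m} U_i
  && \text{within-imputation variance} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{between-imputation variance} \\
T &= W + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{total variance} \\
t &= \frac{\bar{Q}-\mu_0}{\sqrt{T}},
  \qquad
  \nu_m = (m-1)\left[1+\frac{W}{(1+1/m)\,B}\right]^{2}
  && \text{test statistic and degrees of freedom}
\end{align*}
```

The one p-value comes from referring t to a t distribution with \nu_m degrees of freedom. When complete-data degrees of freedom are supplied (the EDF= option in PROC MIANALYZE), an adjusted, smaller value is used in place of \nu_m.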

If you *want* to run things through PROC MIANALYZE, you still can. Let PROC MI build your imputations. Then run the output data set through PROC MEANS, PROC SUMMARY, or PROC UNIVARIATE to get the means and standard errors BY _IMPUTATION_ . Take that data and feed that straight into PROC MIANALYZE using the DATA= option and EDF= (give it the number of records - 1 for the complete-data degrees of freedom).
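A minimal sketch of that route, assuming the mi1 data set produced by the PROC MI step above. The data set name est1 and the standard-error variable names sx and sy are made up for illustration. The DATA step above writes 45 records (36 plus 9 extra records carrying a missing value), so EDF=44 is used here; THETA0= supplies the null values that MU0= supplied in PROC MI.

```sas
/* Means and standard errors of X and Y within each imputation.   */
/* est1, sx and sy are illustrative names, not from the original. */
proc univariate data=mi1 noprint;
   var x y;
   output out=est1 mean=x y stderr=sx sy;
   by _Imputation_;
run;

/* Combine the per-imputation results and test H0: mu=40 for each. */
/* EDF=44 assumes the 45-record temp1 data set built earlier.      */
proc mianalyze data=est1 edf=44 theta0=40 40;
   modeleffects x y;
   stderr sx sy;
run;
```

PROC MEANS or PROC SUMMARY with an OUTPUT statement would serve equally well for the first step; PROC UNIVARIATE is used here only because it gives the means and standard errors in one pass.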

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

