Date: Fri, 8 Oct 1999 16:09:02 -0700
Reply-To: David Cassell <cassell@MERCURY.COR.EPA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David Cassell <cassell@MERCURY.COR.EPA.GOV>
Organization: OAO Corp.
Subject: Re: Replacing Missing values with the Mean
JGerstle@SW.UA.EDU wrote:
>
> I haven't had any luck finding a solution to this in the manuals.
> Hopefully someone can help.
>
> Using SAS 6.12 on Win95, I need to run several PROC CORR
> ALPHA on a dataset with several missing values spread throughout
> it. According to the SAS Log, it's recommended to use NOMISS
> when calculating ALPHA. This is fine, except it leaves us with a
> low n. We would like to replace each missing value with the mean
> of that variable. Is there an option within PROC CORR or another
> PROC to do this automatically, or will I have to calculate the means
> beforehand and run a data step to replace the missing values? I
> have around 40-odd variables, so I'd rather do the former than the
> latter.
It's straightforward to do. One way is via PROC SQL.
Another is to use PROC SUMMARY or PROC MEANS to write the means
to a new dataset and then merge them back into the original data.
But I'm not writing any code here, because I want to say this:
This is a bad idea in many cases. Please do not impute
data like this unless you can show that this is going to
be valid, and will not drive the results [as appears likely
given your concern about small n].
Try a few plots for yourself. Take a nice dataset you have
already, say n=50, and plot X vs Y. Look at the correlation and
the other summary statistics. Now replace half the X's with the
mean of X, and redo the plot and the statistics. You may see an
enormous difference. How large it is will depend on the data.
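
For anyone who wants to try that experiment without a dataset at hand, here is a minimal sketch in Python rather than SAS. Everything in it is an assumption made for illustration: the simulated data, the slope and noise level, the seed, and the choice to treat a random half of the X's as missing and fill them with the mean of the remaining X's. It prints the X-Y correlation for the full data, for the complete cases only, and after mean imputation.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    return sum((x - mean_x) ** 2 for x in xs) / (n - 1)


rng = random.Random(1999)  # arbitrary seed, only for repeatability
n = 50

# Made-up data (an assumption): Y depends linearly on X plus noise.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0.0, 0.6) for xi in x]

# Pretend a random half of the X values are missing.
missing = set(rng.sample(range(n), n // 2))
observed_x = [x[i] for i in range(n) if i not in missing]
observed_y = [y[i] for i in range(n) if i not in missing]

# Mean imputation: fill every "missing" X with the mean of the observed X's.
mean_observed = sum(observed_x) / len(observed_x)
x_imputed = [mean_observed if i in missing else x[i] for i in range(n)]

r_full = pearson_r(x, y)
r_complete_cases = pearson_r(observed_x, observed_y)
r_imputed = pearson_r(x_imputed, y)

print(f"r, full data (n={n}):            {r_full:.3f}")
print(f"r, complete cases (n={len(observed_x)}):      {r_complete_cases:.3f}")
print(f"r, after mean imputation (n={n}): {r_imputed:.3f}")
print(f"var(X) observed: {sample_variance(observed_x):.3f}")
print(f"var(X) imputed:  {sample_variance(x_imputed):.3f}")
```

The exact numbers depend on the seed and on the simulated data. What holds in this setup regardless is that filling in the observed mean shrinks the sample variance of X by the factor (n_obs - 1)/(n - 1), and the correlation after imputation cannot be larger in magnitude than the complete-case correlation. The larger n is bought with values that carry no information about the relationship.
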
So don't do this if you can possibly avoid it.
David
--
David Cassell, OAO cassell@mail.cor.epa.gov
Senior computing specialist
mathematical statistician