| Date: | Sat, 9 Oct 1999 19:02:29 GMT |
| Reply-To: | bruce12@spacelab.net |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Bruce Erlichman <bruce12@SPACELAB.NET> |
| Organization: | B. Erlichman, Inc. |
| Subject: | Re: Replacing Missing values with the Mean |
|---|
See PROC STANDARD for an easy way to mean-fill: its REPLACE option
substitutes each variable's mean for that variable's missing values.
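
A minimal sketch of what that could look like, assuming the data sit in a
dataset called SURVEY and the 40-odd variables are named ITEM1-ITEM40 (both
names are made up for illustration; substitute your own). With REPLACE and
no MEAN= or STD= value given, PROC STANDARD leaves the non-missing values
alone and only fills in the missing ones:

```sas
/* Sketch only: SURVEY, FILLED and ITEM1-ITEM40 are hypothetical names
   standing in for the real dataset and variables. */
proc standard data=survey out=filled replace;
   var item1-item40;   /* numeric variables to mean-fill */
run;

/* FILLED has no missing values in these variables, so NOMISS
   no longer drops any observations. */
proc corr data=filled alpha nomiss;
   var item1-item40;
run;
```

The original dataset is left untouched, so the NOMISS results on SURVEY and
on FILLED can be compared side by side.
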
On Fri, 08 Oct 1999 16:09:02 -0700, David Cassell
<cassell@mail.cor.epa.gov> wrote:
>JGerstle@SW.UA.EDU wrote:
>>
>> I haven't had any luck finding a solution to this in the manuals.
>> Hopefully someone can help.
>>
>> Using SAS 6.12 on Win95, I need to run several PROC CORR
>> ALPHA on a dataset with several missing values spread throughout
>> it. According to the SAS Log, it's recommended to use NOMISS
>> when calculating ALPHA. This is fine, except it leaves us with a
>> low n. We would like to replace each missing value with the mean
>> of that variable. Is there an option within PROC CORR or another
>> PROC to do this automatically or will I have to calculate the means
>> beforehand and run a data step to replace the missing values? I
>> have around 40-odd variables, so I'd rather do the former than the
>> latter.
>
>It's straightforward to do. One way is via PROC SQL.
>Another is to use PROC SUMMARY or PROC MEANS to get the means
>in a new dataset and combine the info. But I'm not writing
>any code here, because I want to say this:
>
>This is a bad idea in many cases. Please do not impute
>data like this unless you can show that this is going to
>be valid, and will not drive the results [as appears likely
>given your concern about small n].
>
>Try a few plots for yourself. Take a nice dataset you have
>already, say n=50, and plot X vs Y. Look at the statistics.
>Now replace half the X's with the mean of X, and redo.
>You may see an enormous difference. It will fluctuate
>depending on the data.
>
>So don't do this if you can possibly avoid it.
>
>David
>--
>David Cassell, OAO cassell@mail.cor.epa.gov
>Senior computing specialist
>mathematical statistician