| Date: | Tue, 23 Jul 2002 10:05:54 -0400 |
| Reply-To: | Steve Albert <salbert@AOL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Steve Albert <salbert@AOL.COM> |
| Subject: | Re: Zeros vs missing |

I'd disagree with Jim Groenevald's last suggestion here:
>Hi William,
>
>I think it depends on the nature of your (missing) data and the
>analysis that you want to do with it. Those missing data, are they
>missing by coincidence, but should not be missing at all? Then they
>are missing. Or are they missing, because there are no such symptoms?
>Then they have a 0-value for the presence of symptoms. However, you
>may analyse your sample(s) as one group (with 0-values as well), e.g.
>to calculate means and proportions, but you may also regard them as
>two different groups, those with and those without symptoms. Anyway,
>my preference would be to regard (and recode) those missing values
>as 0, because they conceptually are 0, not unknown.

As others pointed out, there is a qualitative difference between having
symptoms and not having symptoms. Coding no symptoms as "0" implies that
the difference between no symptoms (0) and very mild symptoms (1) has the
same effect as does the difference between any other two adjacent symptom
score values, say 8 and 9. I don't think that's likely to be a good
assumption, and linearity would not hold.

Someone earlier (Paul Swank?) suggested looking at both qualitative and
quantitative effects; if you include a 0/1 dummy for presence of each
symptom, then recoding the missings to zero would no longer cause the same
problems.
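
To make the dummy-plus-score idea concrete, here is a minimal sketch in Python (not SAS, just to keep it self-contained). The function names and the coefficients are made up purely for illustration, and it assumes a missing score (None) or a recorded 0 means the symptom was absent. The point is only that, with the dummy in the model, the step from "no symptom" to a score of 1 is no longer forced to equal the step from 8 to 9.

```python
def recode_symptom(score):
    """Return (present, score0) for one raw symptom score.

    Assumption: None (missing) or 0 means the patient did not have the
    symptom, not that the value was never collected.
    """
    if score is None or score == 0:
        return 0, 0
    return 1, score


def linear_predictor(score, b0, b_present, b_score):
    """Linear predictor using a 0/1 presence dummy plus the recoded score."""
    present, score0 = recode_symptom(score)
    return b0 + b_present * present + b_score * score0


# Hypothetical coefficients, chosen only to illustrate the arithmetic.
b0, b_present, b_score = 10.0, 4.0, 0.5

# Step from "no symptom" to the mildest score: dummy effect plus one slope unit.
step_0_to_1 = (linear_predictor(1, b0, b_present, b_score)
               - linear_predictor(None, b0, b_present, b_score))

# Step between two adjacent nonzero scores: one slope unit only.
step_8_to_9 = (linear_predictor(9, b0, b_present, b_score)
               - linear_predictor(8, b0, b_present, b_score))

print(step_0_to_1)
print(step_8_to_9)
```

Without the dummy (b_present fixed at 0), both steps would be forced to equal b_score, which is exactly the linearity assumption I'm questioning above.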

All of this of course assumes that the missings really indicate that the
patient with a missing value for a symptom did not have the symptom, rather
than merely indicating that the data was not collected.

Steve Albert
Director of Biostatistics
Spectrum Pharmaceutical Research Corp.
San Antonio, TX
SAlbert at SpectrumCRO dot com