Date: Tue, 24 Aug 1999 14:47:39 -0400
Reply-To: Peter Flom <peter.flom@NDRI.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <peter.flom@NDRI.ORG>
Subject: Re: Missing Values (... not so fast!)
Content-Type: text/plain; charset=US-ASCII
JW = John Whittington <medisci@POWERNET.COM> 08/24/99 02:37PM >>>
SRU = Statistics R Us
SRU John, The situation you describe is one of sloppy data
SRU entry: a genuine zero is entered as missing value.
JW James ... Yes, of course - but that doesn't alter the fact that it's a very
JW common phenomenon.
'Missingness' itself can be very valuable data;
especially when it is related to the data being
collected. For example, consider a situation where you
ask a random sample of politicians: do you use, or have you
ever used, illegal or "recreational" drugs? ....
I agree entirely (mainly in relation to survey data) - but that is really
another example of 'sloppy data entry' - since what you are really talking
about here is data that should be categorised as 'declined to answer' -
which is a little different from 'truly missing' (as in 'someone spilt
coffee on that bit of the form'!).
With regard to missing data on surveys, we also need to
distinguish "missing because of deliberate skip", e.g.
we don't ask people with no kids how old their kids are.
But even this can get complicated. In our research, we ask a lot of sensitive
questions about things like use of illicit drugs. We ask about these in
several ways. So, a person who says "no" to the question "Have you ever used
cocaine?" in one section will then not be asked "How old were you when you first
used cocaine?" But we also have another section, which the subjects
fill out on their own, which asks the same question about ever using cocaine.
For reasons of length (it's already a 2-hour questionnaire) we didn't ask
age of first use in that section.
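
To make the distinctions in this thread concrete, here is a rough sketch of how "age of first use" might be coded so that the reason for a missing value is kept rather than collapsed into a single blank. It is only an illustration, not our actual coding scheme: the category names, the function name code_age_first_use, the answer strings "yes"/"no"/"declined", and the CONFLICT category (interview answer "no", self-administered answer "yes") are all assumptions made up for the example.

```python
from enum import Enum
from typing import Optional, Union


class Missing(Enum):
    """Hypothetical reasons why age of first use has no numeric value."""
    SKIPPED = "skipped by design"      # said "no", so the question was never asked
    DECLINED = "declined to answer"    # respondent refused the screening question
    UNKNOWN = "truly missing"          # no explanation (e.g. coffee on the form)
    CONFLICT = "skipped, but self-report says ever used"


def code_age_first_use(
    ever_used: Optional[str],
    age: Optional[int],
    ever_used_self: Optional[str] = None,
) -> Union[int, Missing]:
    """Return the recorded age, or a code saying why there is none.

    ever_used      -- interview answer: "yes", "no", "declined", or None
    age            -- recorded age of first use, if any
    ever_used_self -- answer from the self-administered section, if any
    """
    if ever_used == "no":
        # The skip was deliberate, but the self-administered section may
        # disagree; in that case the age was never collected anywhere.
        if ever_used_self == "yes":
            return Missing.CONFLICT
        return Missing.SKIPPED
    if ever_used == "declined":
        return Missing.DECLINED
    if ever_used == "yes" and age is not None:
        return age
    # Anything else (asked but blank, or no screening answer at all).
    return Missing.UNKNOWN
```

For example, code_age_first_use("no", None) gives the "skipped by design" code, while the same call with ever_used_self="yes" gives the conflict code. In SAS itself the same idea would usually be carried by special missing values (.A through .Z) with a format attached.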
Peter Flom, Ph.D.
Principal Research Associate
2 World Trade Center
New York, NY 10048
(212) 845-4485 (voice)
(212) 845-4698 (fax)