What I usually do is:
data y;
set x;
if . < a < b;
run;
This makes sure that observations with a=. will not
be in y.
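One caveat, sketched here rather than offered as a definitive fix: the special missing values .A through .Z sort above the ordinary missing value, so `. < a` does not screen them out, while comparing against .Z does. The data set t and the flag names dot_guard and z_guard are made up for illustration.

```sas
/* Hypothetical test data: a takes each kind of missing value, then a real number. */
data t;
  b = 2;
  a = ._; output;   /* sorts below the ordinary missing value */
  a = .;  output;   /* ordinary missing value */
  a = .A; output;   /* sorts above the ordinary missing value */
  a = 1;  output;   /* nonmissing */
run;

data flags;
  set t;
  dot_guard = (.  < a < b);   /* 1 for a=.A and a=1 */
  z_guard   = (.Z < a < b);   /* 1 only for a=1 */
run;

proc print data=flags;
run;
```

If the data can contain special missings, `if .Z < a < b;` is the safer form of the same idiom.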
Kind regards,
Ya Huang
-----Original Message-----
From: David Gardner [mailto:gardner@MAIL.FPG.UNC.EDU]
Sent: Tuesday, May 28, 2002 10:10 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Comparisons involving missing values
Hi, everyone.
Has anyone else encountered problems with missing values
(including special numeric missings) in IF statements?
Consider the following scenario. You write a SAS program that compares
variables a and b and creates an analysis data set, and the results
are wrong because the input data set contains missing values
and the boolean "missing lt nonmissing" is evaluated as true. Has that ever
happened to you? It is especially easy to overlook if special formats, such
as DATE or TIME, are associated with the variable. I bet a lot of these
errors pass by undetected and become published research. (Not where
I work, of course! (LOL ;)).
SAS deems it noteworthy, as it should, when arithmetic operations are
performed on variables containing missing values.
data x;
input a b;
cards;
1 1
2 1
. 2
2 2
;
data z;
set x;
c = a + b;
run;
NOTE: Missing values were generated as a result of performing an
operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 63:10
However, when boolean operations involve missing values, SAS is
strangely quiet.
/* The wrong way in many instances: */
data y;
set x;
if a lt b; /* may be wrong!!! (but SAS says nothing) */
run;
/* The right way uses N (or NMISS) to check the number of non-missing
(or missing) values: */
data y;
set x;
if n(a,b) eq 2 and a lt b; /* right!!! */
run;
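As a rough illustration of the difference, the sketch below flags each row instead of subsetting, so the two conditions can be compared side by side. It rebuilds the same four rows used above; the data set name compare and the variables naive and guarded are made up for this example.

```sas
/* Same four rows as the example data set in this message. */
data x;
  input a b;
  cards;
1 1
2 1
. 2
2 2
;
run;

/* Flag each row rather than subsetting. */
data compare;
  set x;
  naive   = (a lt b);                       /* missing sorts below any number */
  guarded = (n(a, b) eq 2 and a lt b);      /* requires both values present   */
run;

proc print data=compare;
run;
```

With these rows, only the third observation (a=., b=2) has naive=1, and no observation has guarded=1, so the unguarded subsetting IF keeps exactly the one row it should have dropped.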
Everyone understands that "missing lt nonmissing" is supposed to be
true, just as the result of "missing plus nonmissing" is supposed to be
missing. However, both events are probably equally noteworthy. If the
generation of missing values as a result of performing arithmetic
operations on missing values is noteworthy, then shouldn't the generation
of nonmissing values as a result of performing relational operations on
missing values be noteworthy as well?
A little birdy has confided that some developers at SAS are considering
the possibility of adding a system option that when activated would
print a note when missing values are compared in boolean operations. No
results would change; the only change would be a new note in the log.
Conscientious SAS programmers would ensure that the note does not appear
in the log. Appearance of the note might flag a logic error. The note
could be turned off by using N and NMISS logic as above.
Would such an option have helped you in the past?
What do you think about the change?
If this option is created, should it be on by default?
Would such a change be troubling or beneficial?
Any thoughts on the costs and benefits?
Other comments?
--
David M. Gardner
Analyst
FPG Child Development Institute
University of North Carolina at Chapel Hill
Chapel Hill, NC 27599-8185
davidm_gardner@unc.edu