LISTSERV at the University of Georgia
Date:         Wed, 29 Oct 1997 06:26:16 -0800
Reply-To:     TERJESON Mark <TERJEMW@DSHS.WA.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         TERJESON Mark <TERJEMW@DSHS.WA.GOV>
Subject:      Re: Any Good Book on Common Mistakes to avoid in SAS Code
Content-Type: text/plain; charset="iso-8859-1"

On Tue, 28 Oct 1997 21:01:49 GMT, krd@world.std.com (Kamal R Desai) wrote:

>I am looking forward for a book/publication which outlines common
>mistakes / errors in SAS Codes (so that syntax is correct but the output
>is not what was desired).
>---snip

I have never seen such a book, but it sounds like a GREAT idea! Everyone most likely has a hard-earned personal list of "gotchas" to share. Why don't we start a list right here in this thread, so we can all save it for reference after it becomes enormous? I'll contribute a few of my own personal favorites:

==========================================

READING PAST THE END OF AN INPUT RECORD: This usually bites me when I am reading files of variable length records and I get some short ones. If an INPUT statement implicitly or explicitly moves the column pointer beyond the current input record length, the default behaviour is to spill into the next input record. Unless you are parsing, this is nearly always the wrong thing to do.

SOLUTION: Always code the MISSOVER option on the INFILE statement. Then any variables that you try to read from the nonexistent part of a record are set to missing, which is exactly what they are. (IMHO this should have been the default).
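
Here is a minimal sketch of the MISSOVER fix described above. The dataset name, variable names, and data lines are made up for illustration. The second record is short, and with MISSOVER coded the step should set SCORE2 and SCORE3 to missing for that record instead of going to the next line for them.

```sas
/* Hypothetical example data: the second record has only one score */
data work.scores;
   infile datalines missover;   /* short records give missing values */
   input id $ score1 score2 score3;
   datalines;
A01 10 20 30
A02 15
A03 12 18 25
;
run;

proc print data=work.scores;
run;
```

Removing MISSOVER from the INFILE statement and rerunning shows the default spill-over behaviour for comparison. If your short records can end partway through a field read with column or formatted input, the related TRUNCOVER option may be worth a look, since as I understand it that option keeps the partial value rather than setting it to missing.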

=======================================================

OVERLAID VARIABLE VALUES IN A MERGE: Ideally the only variable names in common among a list of MERGEd datasets should be the ones on the BY statement. When non-key variables of the same name exist in multiple datasets, the variable's value in the last matched dataset wins and the others are overlaid. That is how it is supposed to work, but it is often not the desired result. Even missing values can overlay nonmissing values (unless you use UPDATE rather than MERGE). This mistake can waste a lot of debugging time.

SOLUTION: Keep tight control of which variables you want to take from which datasets. Use KEEP lists, either in the creating data step or as a dataset option on the MERGE statement.
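
A minimal sketch of the KEEP approach, assuming two made-up datasets that both carry a non-key variable named STATUS. All dataset and variable names here are hypothetical. The KEEP= dataset options on the MERGE statement take STATUS from the first dataset only, so the second one cannot overlay it.

```sas
/* Hypothetical input: both datasets have a variable called STATUS */
data work.clients;
   input client_id name $ status $;
   datalines;
1 Ann active
2 Bob closed
;
run;

data work.payments;
   input client_id amount status $;
   datalines;
1 100 paid
2 250 late
;
run;

/* Both inputs are already in CLIENT_ID order, so no PROC SORT here. */
/* STATUS comes from WORK.CLIENTS only; the copy in WORK.PAYMENTS    */
/* is never read into the merge.                                     */
data work.combined;
   merge work.clients  (keep=client_id name status)
         work.payments (keep=client_id amount);
   by client_id;
run;
```

If you need both values, a RENAME= dataset option on one of the inputs is another way to keep them apart.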

==========================================================

EQUALITY COMPARISONS WORK INCONSISTENTLY: Comparing fractional values for equality is hazardous, and not just because of roundoff errors. For example, 0.1 cannot be exactly expressed as a binary floating point value because in binary it is a repeating fraction. Rounding to a certain number of decimal places (other than zero) won't help because the result is still a binary floating point number on every platform that I have worked on.

SOLUTION: Use a "fuzz factor" if you must test equality of nonintegers. Rather than "IF X=Y" use "IF ABS(X-Y) < 1E-8". This lets you control "how equal is equal?" within the limits of precision of your particular platform. It's ugly, but it's just a fact of life when you do floating point arithmetic.
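
A small sketch of the fuzz factor test, writing its results to the log. The values and the tolerance of 1E-8 are only illustrative; pick a tolerance that suits the magnitude of your own data.

```sas
data _null_;
   x = 0.1 + 0.2;
   y = 0.3;
   tol = 1e-8;   /* assumed tolerance, not a universal constant */

   /* Exact comparison: the result depends on the platform's */
   /* floating point representation.                          */
   if x = y then put 'Exact test: equal';
   else put 'Exact test: NOT equal';

   /* Fuzzy comparison: equal if the difference is below TOL */
   if abs(x - y) < tol then put 'Fuzzy test: equal';
   else put 'Fuzzy test: NOT equal';
run;
```

On platforms that use IEEE double precision I would expect the exact test to report NOT equal while the fuzzy test reports equal, but run it on your own system to see.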

======================================================

I'll add a couple more easy ones to the list you folks started:

=======================================================

MERGE NOT WORKING CORRECTLY, BUT CODE/LOGIC LOOKS OKAY:

SOLUTION: Don't forget you need a BY statement with the MERGE.

=======================================================

(all kinds of symptoms)

SOLUTION: Missing semicolon

SOLUTION: Misplaced comment symbols

======================================================

THE DATASET YOU ARE READING GETS BLOWN AWAY: (I have never done this but was warned the first day I was taught SAS)

SOLUTION: This one is for those who place the SET statement right after the DATA statement and forget the semicolon on the DATA statement. You would end up with at least three output datasets: one with your intended output dataset name, one dataset named "set", and your input dataset, now written over. This is essentially having three dataset names listed after your DATA statement for output.

=======================================================
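
To make the blown-away dataset gotcha concrete, here is a sketch with made-up dataset names (WORK.SUBSET and WORK.MASTER are hypothetical). The dangerous form is shown only inside a comment so that nobody runs it by accident.

```sas
/* DANGEROUS as written: there is no semicolon after the DATA
   statement, so SET and WORK.MASTER are taken as two more
   output dataset names and WORK.MASTER is overwritten:

   data work.subset  set work.master;
*/

/* Intended form: the DATA statement ends before SET begins */
data work.subset;
   set work.master;
run;
```

Putting SET on its own line, as in the second form, makes a missing semicolon on the DATA statement easier to spot.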

