LISTSERV at the University of Georgia
Date:         Tue, 17 Aug 2004 12:19:58 -0400
Reply-To:     harbourcharles@JOHNDEERE.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Charles Harbour <harbourcharles@JOHNDEERE.COM>
Subject:      Re: Open Discussion: To what extent do you implement error
              checking/handling?

Well, of course it depends! Here are some of my criteria for deciding how much it depends (and which of the various levels you speak of applies):

1) For data quality, what is the impact of not cleansing your data? Will it simply mean a typo that's easily understood and corrected? Or do you use this particular variable for more downstream processing/calculations, such that an error at the source will multiply upon aggregation? How important are those aggregate numbers?

2) For data processing, how important is clean data to the downstream processing? If your customers don't mind having bad data, then what's the big deal? OTOH, if it's imperative that your customers have correct data (thinking of, say, a credit analysis that will determine whether your customer is approved for a car or home loan), then perhaps you should stop the process (or at least kick the questionable record out of processing for examination later) until the data can be cleansed. This is somewhat dependent on whether you're performing batch or online processing--there's no point in holding up an entire batch for one bad record (just throw it off to the side for someone to look at later), nor in holding up an entire online process because the current record is bad.
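
To make the "throw it off to the side" idea concrete, here is a minimal sketch, written in Python rather than SAS purely for illustration. The record layout, the `income` field, and the names `validate` and `process_batch` are all hypothetical; the point is only that a questionable record is set aside with a reason while the rest of the batch keeps going.

```python
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, reason) pairs


def validate(record):
    """Return a reason string if the record is questionable, else None.

    The rules here are invented for the example; real rules depend on the data.
    """
    income = record.get("income")
    if isinstance(income, bool) or not isinstance(income, (int, float)):
        return "income is missing or not numeric"
    if income < 0:
        return "income is negative"
    return None


def process_batch(records):
    """Route each record to 'accepted' or 'rejected' without stopping the batch."""
    result = BatchResult()
    for record in records:
        reason = validate(record)
        if reason is None:
            result.accepted.append(record)
        else:
            # Set the bad record aside for someone to look at later.
            result.rejected.append((record, reason))
    return result


batch = [
    {"id": 1, "income": 52000},
    {"id": 2, "income": None},
    {"id": 3, "income": -10},
    {"id": 4, "income": 71000.5},
]
result = process_batch(batch)
```

In an online process the same `validate` call could reject just the current record instead of collecting a reject pile.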

From personal experience, the ratio of time spent cleansing data up front to time spent cleaning it on the back end is roughly 1 to 10. It will save you many, many headaches to get your data clean as close to the source as possible, where you can make more intelligent inferences about how to fix it, rather than waiting until you're several steps removed and wondering how (and with what qualifications) you will repair your bad apples.

On with the discussion!

CH

On Tue, 17 Aug 2004 10:26:41 -0400, M N <iced_phoenix_news@YAHOO.COM> wrote:

>Dear SAS-L,
>
>I would like to discuss the extent to which each of
>you employs error checking/handling in your code, and
>for what classes of errors. I am particularly
>interested in how you employ error checking in
>general-use macros (i.e. macros that may be used by
>other programmers).
>
>For instance:
>
>* To what extent is the correctness of parameters the
>responsibility of the caller, and to what extent is it
>the responsibility of the macro? If I have a DATA=
>parameter, the macro could:
>
>1.) Simply check that all characters in the variable
>are alpha or a '.'
>2.) Verify that the variable is a valid SAS data set
>name
>3.) Verify that the table actually exists
>4.) Verify that the contents of the table meet the
>specifications of the macro
>5.) Leave one or more of these checks to the caller,
>and let SAS produce an error message when the macro
>tries to read the table in a data step
>
>* To what extent do you check the incoming contents of
>your input data sets (i.e. as each obs is read by SET
>in the data step loop) to match macro/program
>specifications?
>
>* To what extent do you utilize SAS error/return code
>macro vars such as the SYS* family, or the sysmsg()
>function, etc?
>
>* Do any of you use SCL I/O functions rather than the
>normal Base SAS statements to open, read, and process
>data sets so that you get return codes at each
>statement?
>
>* To what extent do you simply let SAS find
>non-application specific errors (such as the
>by-variables in a macro parameter not actually being
>present in the data set) and print Errors/Warnings to
>the log? Do you parse the log for such errors in the
>program itself? Or do it as a separate step after the
>program terminates?
>
>I realize that the answer to all of these questions is
>"it depends", but I'd like to get some idea of what
>other SAS programmers do in various production-quality
>code situations. If you have other error handling
>issues that you'd like to mention outside of the
>questions that I posed, that's great--I'd simply like
>a general discussion of these matters.
>
>Thanks,
>Matt
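
The tiered checks on a DATA= parameter listed in the quoted message (name syntax, then existence, then contents) can be sketched roughly as follows. This is Python for illustration only, not SAS macro code. The function `check_data_param` and the `catalog` dictionary standing in for the SAS libraries are hypothetical, and the sketch assumes a one-level name resolves to WORK, which is the usual default but not guaranteed.

```python
import re

# One- or two-level data set name: an optional libref (up to 8 characters)
# and a member name (up to 32 characters), each starting with a letter or
# underscore and continuing with letters, digits, or underscores.
_NAME = re.compile(
    r"(?:([A-Za-z_][A-Za-z0-9_]{0,7})\.)?([A-Za-z_][A-Za-z0-9_]{0,31})"
)


def check_data_param(value, catalog, required_columns):
    """Return None if the parameter passes every tier, else a message.

    catalog maps (LIBREF, MEMBER) to a list of variable names; it is a
    stand-in for asking SAS whether the table exists and what it contains.
    """
    # Tiers 1-2: is it a syntactically valid data set name?
    match = _NAME.fullmatch(value)
    if match is None:
        return "not a valid data set name"

    # Tier 3: does the table exist? (Assumes one-level names mean WORK.)
    libref = (match.group(1) or "WORK").upper()
    member = match.group(2).upper()
    columns = catalog.get((libref, member))
    if columns is None:
        return "data set does not exist"

    # Tier 4: does it contain the variables the macro needs?
    present = {name.upper() for name in columns}
    missing = sorted({name.upper() for name in required_columns} - present)
    if missing:
        return "missing required variables: " + ", ".join(missing)
    return None


catalog = {("WORK", "SALES"): ["REGION", "AMOUNT"]}
```

Stopping at an earlier tier and letting SAS complain later (option 5 in the quoted list) amounts to returning before the remaining checks.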

