Date: Mon, 4 Jan 2010 18:35:19 -0500
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Subject: Re: Data Validation/Cleansing Tool Query
I have to both agree and disagree with my colleagues; however, for the most
part, I agree with everything they said.
Data validation is FAR from being trivial and, at the risk of offending some
of my colleagues, shouldn't be left solely to the responsibility of
programmers.
Sure, you can write or buy routines for doing many of the tasks, but a lot
of validity checks require business knowledge that programmers might not
have and often require the talents of staff whose salaries are even higher
(believe it or not SAS programmers are not necessarily the highest paid
employees in some organizations).
Are correct codes used? Are entries reasonable and consistent? Over time, can
anomalies or unexpected patterns be identified?
Those questions are all components of data validity and can require anything
from running and reviewing the results of a simple algorithm to comparing
differences between statistical models based on samples from the data.
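To make the "simple algorithm" end of that range concrete, here is a minimal sketch of a code check and a range check. It is written in Python only for illustration; the field names, the valid codes, and the age limits are all hypothetical, and choosing the real ones is exactly where the business knowledge comes in.

```python
# Hypothetical rules, for illustration only. In practice the valid codes
# and plausible ranges come from people who know the business.
VALID_SEX_CODES = {"M", "F"}
AGE_RANGE = (18, 90)


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Are correct codes used?
    if record.get("sex") not in VALID_SEX_CODES:
        problems.append(f"invalid sex code: {record.get('sex')!r}")

    # Are entries present and reasonable?
    age = record.get("age")
    if age is None:
        problems.append("age is missing")
    elif not AGE_RANGE[0] <= age <= AGE_RANGE[1]:
        problems.append(f"age out of range: {age}")

    return problems


# Made-up sample records.
records = [
    {"id": 1, "sex": "M", "age": 34},
    {"id": 2, "sex": "X", "age": 34},
    {"id": 3, "sex": "F", "age": 140},
    {"id": 4, "sex": "F"},
]

# Map each record id to the problems found in it.
report = {r["id"]: check_record(r) for r in records}

for rec_id, problems in report.items():
    if problems:
        print(rec_id, problems)
```

The code itself is the easy part. Deciding that 140 is unreasonable but 90 is fine is not a programming question.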
Who should do the work, I think, depends upon which specific task is being
done, the available staff, and the skills required.
On Mon, 4 Jan 2010 15:43:29 -0500, Jonathan Goldberg
>We are currently using SAS to do validation. We write programs to check
>things like ranges, all fields present, etc.
>Since this is a clinical trials environment it is also necessary to check
>across records for visit sequence, missing visits, etc.
>While we have a lot of this packaged into macros, it seems to me that
>there should be tools available that allow non-programmers to do a lot
>(preferably, all) of this. It seems a waste to need programmers to do
>something so low-level.
>Anyone have suggestions for products that might fill the bill?
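For reference, the cross-record checks mentioned in the quoted message (visit sequence, missing visits) might look roughly like the sketch below. It is in Python rather than SAS, and the visit schedule, the data layout, and the function name are assumptions made for illustration, not anything taken from the original programs.

```python
from datetime import date

# Hypothetical protocol: every subject should have visits 1 through 4.
EXPECTED_VISITS = [1, 2, 3, 4]


def check_visits(visits):
    """Check one subject's visits, given as (visit_number, visit_date) pairs."""
    problems = []

    # Missing visits: any expected visit number that was never recorded.
    seen = {number for number, _ in visits}
    for number in EXPECTED_VISITS:
        if number not in seen:
            problems.append(f"missing visit {number}")

    # Visit sequence: in visit-number order, dates should not go backwards.
    ordered = sorted(visits, key=lambda v: v[0])
    for (n1, d1), (n2, d2) in zip(ordered, ordered[1:]):
        if d2 < d1:
            problems.append(f"visit {n2} is dated before visit {n1}")

    return problems


# Made-up subjects: one clean, one with a gap and an out-of-order date.
subject_ok = [
    (1, date(2009, 1, 5)),
    (2, date(2009, 2, 2)),
    (3, date(2009, 3, 2)),
    (4, date(2009, 4, 6)),
]
subject_bad = [
    (1, date(2009, 1, 5)),
    (3, date(2009, 3, 2)),
    (4, date(2009, 2, 20)),
]

print(check_visits(subject_ok))
print(check_visits(subject_bad))
```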