Date: Mon, 28 Nov 2005 09:28:56 -0500
Reply-To: "Fehd, Ronald J" <rjf2@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Fehd, Ronald J" <rjf2@CDC.GOV>
Subject: Re: Data Cleaning Books
> From: toby dunn
> Does anyone have any favorite data cleaning or Data Quality
> Management books other than Ron Cody's book that they would
> like to recommend?
> I think I have started going way beyond Ron's book.
What is the scope of your questions?
* how to identify stuff?
* what to do with this stuff?
* how to update the stuff in our data sets?
In my own work I resolved 80% of my interminable questions with:
* the data collection form
* the data dictionary
* and a freq of all variables
See the quote in my signature, which is the summary
of my decade of data cleansing.
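For the third item, a freq of all variables, a minimal sketch might look like this. The data set name work.survey is a placeholder of mine, not something from the original question; substitute your own.

```sas
/* work.survey is a hypothetical data set name */
proc freq data=work.survey;
   /* one-way frequency table for every variable, */
   /* counting missing values as a level          */
   tables _all_ / missing;
run;
```

Reading each table against the data collection form and the data dictionary is what surfaces the values that should not be there.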
Ron Fehd the macro maven CDC Atlanta GA USA RJF2 at cdc dot gov
Your task is simple: remove the difference
between how things should be
and how they really are.
-- Ashleigh Brilliant pot-shot #4247
got user-defined formats?
then 80% of -your- job is done.
80% of -somebody- else's job is to review the reports.
%INVALID: a data review macro
using proc FORMAT option other=INVALID to identify and list outliers
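A minimal sketch of the idea behind that tag line, not the %INVALID macro itself: define a user-defined format that names every allowed value and maps everything else to INVALID, then list the rows where the formatted value comes back INVALID. The formats $sex and agegrp, the data set work.survey, the variables id, sex and age, and the allowed values are all assumptions for illustration.

```sas
proc format;
   value $sex                 /* hypothetical character format */
      'F'   = 'Female'
      'M'   = 'Male'
      ' '   = 'Missing'
      other = 'INVALID';      /* anything not listed above */
   value agegrp               /* hypothetical numeric format */
      0 -< 18  = 'Child'
      18 - 120 = 'Adult'
      .        = 'Missing'
      other    = 'INVALID';   /* out-of-range values */
run;

/* list only the rows with at least one invalid value */
proc print data=work.survey;
   where put(sex, $sex.) = 'INVALID'
      or put(age, agegrp.) = 'INVALID';
   var id sex age;
run;
```

The same formats can be attached in proc freq with a format statement, so the invalid values collapse into a single INVALID row in the report.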