Date: Tue, 12 Mar 2002 15:19:17 -0500
Reply-To: Courtney Cook <ccook@MACROINT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Courtney Cook <ccook@MACROINT.COM>
Subject: address field - cleaning and separating
I am interested in finding a pre-written block of code to clean and
separate an address variable into its components. Ultimately, I would like
to separate the address variable into the following types of variables for
the purpose of finding duplicates:
street_number, street_name, street_type, po_box, and other_info (such as
suite_number).
For example, I would like to separate these
"123 N. Main Street, Suite 456"
"P.O. Box 789, 123 Main St. #456"
into the correct fields, so that "123" matches "123", "Main"
matches "Main", etc.
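
To make the goal concrete, here is a minimal sketch of the kind of split I have in mind. It is written in Python rather than SAS, purely for illustration, and it is not taken from any of the resources mentioned below. The street-type table, the regular expressions, the function name parse_address, and the extra "direction" field (used to hold a leading N/S/E/W so that it does not end up in the street name) are all assumptions of this sketch. It only handles addresses shaped like the two examples above.

```python
import re

# Hypothetical lookup table: maps spellings of a street type to one form.
STREET_TYPES = {
    "STREET": "ST", "ST": "ST",
    "AVENUE": "AVE", "AVE": "AVE",
    "ROAD": "RD", "RD": "RD",
    "DRIVE": "DR", "DR": "DR",
    "BOULEVARD": "BLVD", "BLVD": "BLVD",
}
DIRECTIONS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}

# Assumed patterns; real data will need many more variants.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\s+(\w+)")
UNIT = re.compile(r"(?:\b(?:SUITE|STE|APT|UNIT)\b\.?\s*|#\s*)(\w+)")
STREET = re.compile(r"^(\d+)\s+(.*)$")


def parse_address(raw):
    """Split one address string into comparable, upper-cased fields."""
    text = raw.upper()
    fields = {
        "street_number": "", "direction": "", "street_name": "",
        "street_type": "", "po_box": "", "other_info": "",
    }

    # Pull out the P.O. box first so its number is not read as a street number.
    m = PO_BOX.search(text)
    if m:
        fields["po_box"] = m.group(1)
        text = text[:m.start()] + " " + text[m.end():]

    # Pull out a suite/apartment/unit number or a "#456" style suffix.
    m = UNIT.search(text)
    if m:
        fields["other_info"] = m.group(1)
        text = text[:m.start()] + " " + text[m.end():]

    # Drop remaining punctuation and collapse runs of blanks.
    text = " ".join(re.sub(r"[^A-Z0-9 ]", " ", text).split())

    # What is left is expected to look like "<number> <name words> <type>".
    m = STREET.match(text)
    if m:
        fields["street_number"] = m.group(1)
        words = m.group(2).split()
        if words and words[-1] in STREET_TYPES:
            fields["street_type"] = STREET_TYPES[words.pop()]
        if len(words) > 1 and words[0] in DIRECTIONS:
            fields["direction"] = words.pop(0)
        fields["street_name"] = " ".join(words)
    return fields


a = parse_address("123 N. Main Street, Suite 456")
b = parse_address("P.O. Box 789, 123 Main St. #456")
```

With this sketch, both example addresses come out with street_number "123", street_name "MAIN", street_type "ST", and other_info "456", while po_box is "789" only for the second one, so the two records could be compared field by field.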
Does anybody know of any such code or any resources?
I have looked at the document by Charles Patridge, "The Fuzzy Feeling SAS
Provides: Electronic Matching of Records without Common Keys." I have also
seen the code on sconsig.com TIP00128a "Cleansing Macro, Data Scrubbing
routine (see tip 00128 for more), just one schema technique by Charles
Patridge." I don't know whether they would be easy to implement or whether
a better method exists now.
Thanks!