LISTSERV at the University of Georgia
Date:   Thu, 24 May 2001 18:15:38 -0400
Reply-To:   Jonathan Siegel <Jonathan.Siegel@PFIZER.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Jonathan Siegel <Jonathan.Siegel@PFIZER.COM>
Subject:   Re: Duplicate Records
Comments:   To: khan.m@GHC.ORG

It might help to do a quick cost-benefit assessment before getting down to work. What is the dollar value (including goodwill) of solving this problem? How much do you lose if folks get two pieces of mail? Anything at all? And what is the likelihood of that occurring?

If the benefit is high, there are many companies with sophisticated address-matching routines that offer better results than you are likely to get by trying this yourself.

If the benefit is very low, your employer might be better off just letting the whole thing slide than paying to solve it.

Hope this helps,

Jonathan Siegel
Pfizer Clinical Research and Development
Ann Arbor Laboratories
734.622.3982

On Wed, 23 May 2001 23:15:35 -0700, Khan M. <khan.m@GHC.ORG> wrote:

>Hi,
>
>I hope someone can help.
>
>I have a dataset of name, address, etc., of about
>1,000,000 records. I need to delete the duplicate records.
>The only problem is that I need to standardize the address field.
>
>For example:
>
>Street to St
>Lane to Ln
>P.O. Box to Box
>
>Any suggestions?
>
>Thanks
>Khan
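
For anyone who does decide the do-it-yourself route is worth the cost, the standardize-then-deduplicate step in the quoted question might be sketched roughly as follows. This is only an illustration, written in Python rather than SAS. The function names and the record layout (name, address pairs) are hypothetical, and only the three abbreviations come from the original question. It does not handle misspellings, trailing punctuation, or other variants that a commercial address-matching routine would catch.

```python
import re

# Only these three mappings come from the question; a real table would be much longer.
# The P.O. Box pattern tolerates optional periods and spaces ("PO BOX", "P.O. BOX", "P O BOX").
ABBREVIATIONS = [
    (r"\bP\.?\s*O\.?\s*BOX\b", "BOX"),
    (r"\bSTREET\b", "ST"),
    (r"\bLANE\b", "LN"),
]


def standardize_address(address):
    """Upper-case, collapse repeated whitespace, and apply the abbreviation table."""
    text = " ".join(address.upper().split())
    for pattern, replacement in ABBREVIATIONS:
        text = re.sub(pattern, replacement, text)
    return text


def drop_duplicates(records):
    """Keep the first record for each (normalized name, standardized address) key.

    `records` is assumed to be an iterable of (name, address) tuples.
    The original, unmodified record is what gets kept.
    """
    seen = set()
    unique = []
    for name, address in records:
        key = (" ".join(name.upper().split()), standardize_address(address))
        if key not in seen:
            seen.add(key)
            unique.append((name, address))
    return unique
```

Matching on the standardized key while keeping the original record means the mailing file itself is not altered; only the comparison is normalized.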

