It might help to do a quick cost-benefit assessment before getting down to
work. What is the dollar value (including goodwill) of solving this problem?
How much do you lose if someone gets two pieces of mail: nothing at all, or
something measurable? How likely is each outcome?
If the benefit is high, many companies offer sophisticated address-matching
routines that will likely give better results than building your own.
If the benefit is very low, your employer might be better off letting
the whole thing slide rather than paying to solve it.
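
As a rough illustration of the assessment above, here is a minimal Python
sketch. The function names and every figure in the usage note are
hypothetical placeholders, not numbers from this thread; plug in your own
estimates.

```python
def expected_annual_loss(n_records, dup_rate, cost_per_dup, mailings_per_year):
    """Estimated yearly cost of leaving duplicates in the file.

    n_records         -- number of records in the mailing file
    dup_rate          -- assumed fraction of records that are duplicates
    cost_per_dup      -- assumed cost of one extra mail piece (postage,
                         printing, lost goodwill), in dollars
    mailings_per_year -- how many times the file is mailed per year
    """
    return n_records * dup_rate * cost_per_dup * mailings_per_year


def worth_fixing(annual_loss, cleanup_cost):
    """True if the estimated yearly loss exceeds the cost of the cleanup."""
    return annual_loss > cleanup_cost
```

For example, with 1,000,000 records, an assumed 2% duplicate rate, $0.50 per
extra piece, and 4 mailings a year, compare
expected_annual_loss(1_000_000, 0.02, 0.50, 4) against a vendor's quote
using worth_fixing().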
Hope this helps,
Pfizer Clinical Research and Development
Ann Arbor Laboratories
On Wed, 23 May 2001 23:15:35 -0700, Khan M. <khan.m@GHC.ORG> wrote:
>I hope someone can help.
>I have a dataset of name, address, etc., of about
>1,000,000 records. I need to delete the duplicate records.
>The only problem is that I need to standardize the address field.
> For Example:
>Street to St
>Lane to Ln
>P.O. Box to Box
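
If you do try the standardization yourself, the three substitutions listed
in the question could be sketched as follows. This is a minimal Python
sketch, not a production address matcher: the ABBREVIATIONS table, the
function names, and the (name, address) record layout are assumptions for
illustration, and real address data will need many more rules.

```python
import re

# Hypothetical lookup table; extend it with whatever suffixes your data uses.
ABBREVIATIONS = {"STREET": "ST", "LANE": "LN"}


def standardize_address(address):
    """Return an upper-cased address with a few common variants normalized."""
    text = address.upper()
    # "P.O. Box", "PO Box", "P. O. Box" -> "BOX"
    text = re.sub(r"\bP\.?\s*O\.?\s*BOX\b", "BOX", text)
    # Drop periods and commas so "St." and "St" compare equal.
    text = re.sub(r"[.,]", " ", text)
    # Replace whole words only, and collapse repeated whitespace.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def drop_duplicates(records):
    """Keep the first record for each (name, standardized address) pair.

    records is assumed to be an iterable of (name, address) tuples.
    """
    seen = set()
    unique = []
    for name, address in records:
        key = (name.strip().upper(), standardize_address(address))
        if key not in seen:
            seen.add(key)
            unique.append((name, address))
    return unique
```

Note that a whole-word replacement like this will also rewrite street names
that happen to be "Lane" or "Street" (e.g. "Lane Street" becomes "LN ST").
That is harmless for building a comparison key, but you would not want to
write the result back over the original address.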