Date: Tue, 1 Mar 2011 19:40:29 -0500
Reply-To: Arthur Tabachneck <art297@ROGERS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@ROGERS.COM>
Subject: Re: Fuzzy name matching
Sterling,
I don't have time to simplify all of the links for you, but here is one:
http://www.sconsig.com/sastips/tip00000.htm
The source code is at the bottom of the page.
Art
--------
On Tue, 1 Mar 2011 16:02:18 -0800, Sterling Paramore <gnilrets@GMAIL.COM> wrote:
>I clicked on the "Fuzzy Logic Software" but couldn't find any software. I
>only saw a broken link to a paper.
>
>On Tue, Mar 1, 2011 at 3:56 PM, Arthur Tabachneck <art297@rogers.com> wrote:
>
>> Sterling,
>>
>> I never suggested that the site was developed or enhanced according to the
>> most current protocols. However, if you choose to ignore it because it
>> happens to be written in an old-fashioned format, your loss!
>>
>> Much of the code on that site, in my opinion, is invaluable.
>>
>> Art
>> -------
>> On Tue, 1 Mar 2011 15:46:30 -0800, Sterling Paramore <gnilrets@GMAIL.COM>
>> wrote:
>>
>> >Woah, I feel like I just got back after traveling backward in time.
>> >
>> >You found code on that site?
>> >
>> >No offense, I appreciate your time and help, but that site is impressively
>> >terrible.
>> >
>> >Circle wipes?
>> >
>> >On Tue, Mar 1, 2011 at 3:29 PM, Arthur Tabachneck <art297@rogers.com> wrote:
>> >
>> >> Sterling,
>> >>
>> >> You can find some nice code via the following Google search:
>> >> fuzzy name site:www.sconsig.com
>> >>
>> >> Art
>> >> -------
>> >> On Tue, 1 Mar 2011 14:54:40 -0800, Sterling Paramore <gnilrets@GMAIL.COM>
>> >> wrote:
>> >>
>> >> >Dear SAS-L,
>> >> >
>> >> >Before I start doing a more exhaustive Google search, does anyone know of
>> >> >some good references for "fuzzy" name matching? The problem is that I've
>> >> >got a list of member ids and names, something like
>> >> >
>> >> >id  last             first         bday
>> >> >6   Smithers, Jr.    Waylon        06/13/1952
>> >> >8   Smithers         Waylon        06/13/1952
>> >> >7   Simpson          Homer         10/31/1956
>> >> >4   Simpson          Homer         11/31/1956
>> >> >1   Simpson          Maggie        12/21/1988
>> >> >3   Simpson          Margaret      12/21/1988
>> >> >2   Bouvier-Simpson  Marjorie      03/15/1958
>> >> >5   Simpson          Marge         03/15/1958
>> >> >9   Van Houten       Milhouse      05/09/1980
>> >> >10  Houten           Milhouse Van  05/09/1980
>> >> >
>> >> >I need to create a universal id that references the person, not just the
>> >> >name. This is hard, because people can spell their names differently on
>> >> >different forms (e.g., Maggie/Margaret or Marjorie/Marge), include or
>> >> >exclude hyphenation (Bouvier-Simpson/Simpson), keep or drop name suffixes
>> >> >(Jr.), data entry forms can get confused (Van Houten/Milhouse Van), or even
>> >> >have simple typos (Homer's birthday). To name just a few! Creating a
>> >> >universal id requires some fuzzy pattern matching to assign some kind of
>> >> >likelihood that two name-birthdays actually refer to the same person.
>> >> >Anyone have any ideas of good places to start for solving this kind of
>> >> >problem?
>> >> >
>> >> >Thanks,
>> >> >Sterling
>> >>
>>
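
For readers following the thread, the kind of likelihood score asked for in
the original question can be sketched roughly as below. This is not the code
from the sconsig.com page and not SAS; it is a minimal Python illustration
using only the standard library. The suffix list, the 0.6/0.4 weights, the
0.85 threshold, and the function names (name_tokens, similarity, match_score)
are all assumptions made up for this example. Birthdays are compared as plain
strings, which also sidesteps the fact that 11/31/1956 is not a valid date.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical suffix list; extend it for real data.
SUFFIXES = {"jr", "sr", "ii", "iii"}

def name_tokens(last, first):
    """Lowercase, split on anything that is not a letter, drop suffixes, sort.

    Sorting makes the result independent of which field a token was typed in,
    so "Van Houten / Milhouse" and "Houten / Milhouse Van" tokenize alike.
    """
    raw = re.split(r"[^a-z]+", f"{last} {first}".lower())
    return sorted(t for t in raw if t and t not in SUFFIXES)

def similarity(a, b):
    """Similarity ratio in [0, 1] from difflib's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()

def match_score(rec_a, rec_b, name_weight=0.6, bday_weight=0.4):
    """Weighted blend of name and birthday similarity (weights are assumed)."""
    name_a = " ".join(name_tokens(rec_a["last"], rec_a["first"]))
    name_b = " ".join(name_tokens(rec_b["last"], rec_b["first"]))
    return (name_weight * similarity(name_a, name_b)
            + bday_weight * similarity(rec_a["bday"], rec_b["bday"]))

# A few rows from the sample data in the original post.
records = [
    {"id": 6, "last": "Smithers, Jr.", "first": "Waylon", "bday": "06/13/1952"},
    {"id": 8, "last": "Smithers", "first": "Waylon", "bday": "06/13/1952"},
    {"id": 7, "last": "Simpson", "first": "Homer", "bday": "10/31/1956"},
    {"id": 4, "last": "Simpson", "first": "Homer", "bday": "11/31/1956"},
    {"id": 9, "last": "Van Houten", "first": "Milhouse", "bday": "05/09/1980"},
    {"id": 10, "last": "Houten", "first": "Milhouse Van", "bday": "05/09/1980"},
]

if __name__ == "__main__":
    # Score every pair and list those above an assumed threshold.
    for a, b in combinations(records, 2):
        score = match_score(a, b)
        if score >= 0.85:
            print(a["id"], b["id"], round(score, 2))
```

Pure string similarity will not reliably link nicknames such as
Maggie/Margaret or Marge/Marjorie; those usually need a lookup table of name
variants on top of a score like this one. Comparing every pair is also
quadratic in the number of records, so a real member list would normally be
grouped first (for example by birth year) before scoring.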