LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2000, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 10 Apr 2000 09:54:01 -0400
Reply-To:     Steve Giblin <Steve_Giblin@VAPOWER.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
Comments:     To: Victor Aina <aina@SFU.CA>
From:         Steve Giblin <Steve_Giblin@VAPOWER.COM>
Subject:      Re: Match merge problem
Comments: To: SAS-L@LISTSERV.VT.EDU
Content-type: text/plain; charset=us-ascii

Another approch you may wish to employ is SQL. This will allow you to grab multiples of partial matches, then you can devise a hierarchy for which matches are "best".

Note: You will have to be clear on your variable names You might also want to have a unique sequence number in each of the lists (ex SEQA, SEQB) to help in the dedupe process.

Something like

PROC SQL; CREATE TABLE MATCH AS SELECT A.*, B.* FROM LISTA A, LISTB B WHERE A.PHONEA = B.PHONEB OR A.LNAMEA = B.LNAMEB OR A.ADDR1A = B.ADDR1B

OR ...

; QUIT;

* If you have the name components parsed in one file but not the other ; * you can employ something like

OR INDEX(FULLNAMA,TRIM(LNAMEB))>0

* Same thing for address components

OR INDEX(ADDR1A,TRIM(HOUSENMB))=1 OR INDEX(ADDR1A,TRIM(STREETB))>0

This SQL method will allow you to grab all of the possible matches and then you will have to read through multiples and settle upon what is "best". You will need to makes sure that you do not apply too loose of an OR condition, otherwise you will get too many. For example, if you simply allow for a lookup of "Smith", on a last name match, you will get many unwanted matches. But if you do "Smith" AND parts of the address, you can manage your result set.

Hope this adds another option for you.

Steve


Back to: Top of message | Previous page | Main SAS-L page