Date: Wed, 29 Apr 2009 16:29:25 -0700
Reply-To: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Subject: Re: fuzzy match problem
In-Reply-To: A<200904292012.n3TK0TFS016343@malibu.cc.uga.edu>
Content-Type: text/plain; charset="utf-8"
Hi Terry -
In addition to the matching functions SPEDIS, COMPGED, COMPLEV, and SOUNDEX, you can also parse the strings and compare them word by word. One method is to scan the words from the two strings and count how many match -
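data _null_;
  name1="10K WIZARD TECHNOLOGY LLC";
  name2="10-K WIZARD TECHNOLOGY LLC";
  MatchingWords=0; _j=1;
  /* scan() treats blanks and punctuation such as "-" as delimiters by default */
  /* outer loop: each word of name1 */
  do while(not missing(scan(name1,_j)));
    _i=1;
    /* inner loop: compare it with each word of name2 */
    do while(not missing(scan(name2,_i)));
      if scan(name1,_j)=scan(name2,_i) then MatchingWords+1;
      put (_all_)(=);  /* trace each comparison in the log */
      _i+1;
    end;
    _j+1;
  end;
  /* each counter ends one past the last word; >< is the MIN operator */
  TotalWords=(_j-1)><(_i-1);
  Match=MatchingWords/TotalWords;
  put MatchingWords= TotalWords= Match=;
run;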
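
For comparison, here is a minimal sketch that runs the built-in functions mentioned above on the same two names. The variable names (lev, ged, sped, SameSound, ExactNoHyphen) are only illustrative, and dropping hyphens before comparing is an assumption about your data rather than a general rule. For the first three scores, lower means closer; you would have to look at the values on your own lists to decide what cutoff counts as a match.

```sas
data _null_;
  name1="10K WIZARD TECHNOLOGY LLC";
  name2="10-K WIZARD TECHNOLOGY LLC";
  lev=complev(name1,name2);    /* Levenshtein edit distance */
  ged=compged(name1,name2);    /* generalized edit distance */
  sped=spedis(name1,name2);    /* spelling distance; argument order matters */
  SameSound=(soundex(name1)=soundex(name2));  /* 1 if the SOUNDEX codes agree */
  /* assumption: hyphens carry no meaning in these names, so drop them first */
  ExactNoHyphen=(compress(name1,'-')=compress(name2,'-'));
  put lev= ged= sped= SameSound= ExactNoHyphen=;
run;
```

The scores are written to the log, so you can try a few known pairs before running anything against the full lists.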
Paul Choate
DDS Data Extraction
(916) 654-2160
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Terry He
Sent: Wednesday, April 29, 2009 1:12 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: fuzzy match problem
I have two variables. I am trying to match one variable to another.
For example, one list has “10-K WIZARD TECHNOLOGY LLC” and the other
has “10K WIZARD TECHNOLOGY LLC”. The VLOOKUP function in Excel will not
necessarily return the required result in this case.
How could I do it in SAS?
Here is some example data:
Var1                          Var2
101 CALIFORNIA VENTURE        @STAKE, INC
10K WIZARD TECHNOLOGY LLC     10-K WIZARD TECHNOLOGY LLC
13D RESEARCH INC              1E LIMITED
2008 MIECF                    29WEST INC.
2C COMERCIO E IMPORTACAO DE   3 TIER TECHNOLOGY INC.
2K ADVISORS LLC               33-6 CONSULTANCY LTD
3 B CLIM                      360 CONSULTING INC.
3 REASONS LTD.                360 RELOCATIONS LIMITED
3DADVISORS LLC                3SCOM Y.K.
3V CAPITAL LIMITED            3T SYSTEMS, INC
4 TABELIAO DE PROTESTO DE     4CAST LIMITED
401K COMPANY                  5B TECHNOLOGIES CORP
A G EDWARDS INC               6FIGUREJOBS.COM LLC
A V ARKANSAS                  7 CITY LEARNING LIMITED
AAA LAUNDROMAT                9-20 RECRUITMENT LTD.
AAA RESEARCHONE FINANCIAL     A. EPSTEIN & SONS INTERNATIONAL, INC.
ABATEX INDUSTRIA E COMERCIO   A. PAPPAJOHN COMPANY
ABG SUNDAL COLLIER INC        A.S.A.INTERNATIONAL HOLDINGS LIMITED
ABN AMRO HOLDING NV           A1 EXPRESS DELIVERY SERVICE INC.