| Date: | Fri, 29 Aug 2008 12:28:32 -0700 |
| Reply-To: | "Choate, Paul@DDS" <pchoate@DDS.CA.GOV> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "Choate, Paul@DDS" <pchoate@DDS.CA.GOV> |
| Subject: | Re: Sounds Like... |
|
| In-Reply-To: | A<38925.150.142.232.4.1220034752.squirrel@webmail.albany.edu> |
| Content-Type: | text/plain; charset="us-ascii" |
My guess would be on the second the SQL processor computes the soundex
scores and then matches, hence a optimized join, while the first matches
and then computes the soundex scores, hence a Cartesian join.
As expected the second runs much more quickly. Interesting point Mike.
Paul Choate
DDS Data Extraction
(916) 654-2160
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Mike Zdeb
Sent: Friday, August 29, 2008 11:33 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Sounds Like...
hi ... so if the SOUNDS LIKE operator compares the SOUNDEX of variables,
I don't understand why
the first use of PROC SQL below produces the NOTE
NOTE: The execution of this query involves performing one or more
Cartesian product joins that can
not be optimized.
while the second on does not
can any of you SQL folk explain? thanks
data a;
input name1 : $10. @@;
datalines;
SMITH SMYTHE O'HENRY OHENRY ZDEB ZDYB SOO SU CLINTON CLINTEN
;
run;
data b;
set a (rename=(name1=name2));
run;
proc sql;
create table matches as
select *
from a,b
where name1 =* name2;
quit;
proc sql;
create table matches as
select *
from a,b
where soundex(name1) = soundex(name2);
quit;
--
Mike Zdeb
U@Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475
|