LISTSERV at the University of Georgia
Date:   Thu, 17 May 2001 10:03:33 -0400
Reply-To:   Sigurd Hermansen <hermans1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Sigurd Hermansen <hermans1@WESTAT.COM>
Subject:   Re: PROC MATCH
Comments:   To: Charles Patridge <Charles_S_Patridge@PRODIGY.NET>

For what it's worth, you certainly have my permission to post the little program that I wrote to illustrate fuzzy linkage/matching using a highly simplified scoring method. Some may find my example of how to use the SAS SPEDIS() function (cribbed almost directly from SAS Institute documentation) useful as they begin to experiment with fuzzy key linkage. I still feel compelled to warn anyone who does experiment with it that it forms a Cartesian product of a table and its own image, so the number of row pairs it must score grows with the square of the row count. If applied to tables of more than a few thousand rows, it will likely blow up.
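To make the warning concrete, here is a minimal sketch of an all-pairs fuzzy self-match. It is written in Python rather than SAS, and it is not the program mentioned above. The sample names, the 0.85 threshold, and the function names are assumptions made for illustration, and `difflib.SequenceMatcher.ratio()` stands in for SPEDIS(), which is a different measure. The sketch scores each unordered pair once, whereas a full Cartesian product would score every ordered pair.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample keys, not taken from the original program.
names = ["HERMANSEN", "HERMANSON", "PATRIDGE", "PARTRIDGE", "SMITH"]


def similarity(a, b):
    # SequenceMatcher.ratio() is a stand-in for SPEDIS(); the two measures differ.
    return SequenceMatcher(None, a, b).ratio()


def pair_count(n):
    # Number of unordered pairs among n rows: n * (n - 1) / 2.
    return n * (n - 1) // 2


def fuzzy_self_match(keys, threshold=0.85):
    # Score every unordered pair and keep those at or above the threshold.
    pairs = []
    for a, b in combinations(keys, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 3)))
    return pairs


matches = fuzzy_self_match(names)
for a, b, score in matches:
    print(a, b, score)

# The work grows quadratically with the number of rows.
print(pair_count(len(names)), "pairs for", len(names), "rows")
print(pair_count(5000), "pairs for 5000 rows")
```

Five rows need only 10 comparisons, but 5,000 rows already need about 12.5 million, which is why an unblocked self-join stops being practical so quickly.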

A couple of years ago or so I posted a SAS macro program that generated weights based on frequencies for elements of linkage keys and used the weights to calculate similarity scores per record pair. That program includes a section in which the user can specify blocking variables that SAS SQL can use to form an index. Anyone considering a fuzzy or probabilistic record linkage/matching project needs to understand blocking strategies and how to use them.

Sig
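As a rough illustration of blocking combined with frequency-based weights, the following Python sketch pairs records only within groups that share a blocking value. It is not the macro program described above. The sample records, the choice of birth year as the blocking variable, and the weight formula log2(N / count) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2

# Hypothetical records: (id, surname, birth_year). Not from the original macro.
records = [
    (1, "SMITH", 1960),
    (2, "SMITH", 1960),
    (3, "SMITH", 1971),
    (4, "HERMANSEN", 1960),
    (5, "HERMANSEN", 1960),
    (6, "PATRIDGE", 1955),
]


def frequency_weights(values):
    # Assumed weighting: rarer values earn larger agreement weights, log2(N / count).
    counts = Counter(values)
    n = len(values)
    return {v: log2(n / c) for v, c in counts.items()}


surname_w = frequency_weights([r[1] for r in records])


def blocked_pairs(rows, block_key):
    # Group rows by the blocking key, then pair rows only within each group.
    blocks = defaultdict(list)
    for row in rows:
        blocks[block_key(row)].append(row)
    for group in blocks.values():
        yield from combinations(group, 2)


def score(a, b):
    # Agreement on surname earns that surname's weight; disagreement earns 0.
    return surname_w[a[1]] if a[1] == b[1] else 0.0


# Block on birth year (an assumed choice of blocking variable).
scored = [(a[0], b[0], round(score(a, b), 3))
          for a, b in blocked_pairs(records, lambda r: r[2])]

for left_id, right_id, s in scored:
    print(left_id, right_id, s)
```

With these six records, blocking on birth year cuts the candidate pairs from 15 to 6, and agreement on the rarer surname scores higher than agreement on the more common one. The trade-off is that a true match whose blocking value differs (record 3 here) is never compared at all, which is why the choice of blocking variables matters.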

