If you have SAS/OR, look at PROC ASSIGN.
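
Without SAS/OR, the same idea can be tried by hand: within each cluster, pick the one-to-one set of pairs with the largest total weight. Below is a minimal sketch in Python (not SAS), assuming the clusters are small enough for exhaustive search. The names `pairs` and `best_links` are illustrative only, and the data are the six sample pairs from the quoted message. The search is exponential in cluster size, so large clusters would need a proper assignment solver instead.

```python
from collections import defaultdict

# Candidate pairs from the example in the quoted message:
# (cluster, id_a, id_b, weight)
pairs = [
    (1, 1, 3, 11), (1, 1, 6, 13),
    (2, 3, 8, 11), (2, 4, 9, 11), (2, 5, 8, 14), (2, 5, 9, 13),
]

def best_links(cands):
    """Return (total_weight, links) for the best one-to-one subset of cands.

    cands is a list of (id_a, id_b, weight) tuples from a single cluster.
    Exhaustive search: each pair is either rejected or accepted.
    """
    if not cands:
        return 0, []
    (a, b, w), rest = cands[0], cands[1:]
    # Option 1: reject the first pair.
    best = best_links(rest)
    # Option 2: accept it and drop every pair sharing its A or B record.
    compatible = [p for p in rest if p[0] != a and p[1] != b]
    total, links = best_links(compatible)
    if total + w > best[0]:
        best = (total + w, [(a, b, w)] + links)
    return best

# Group the candidate pairs by cluster, then solve each cluster separately.
clusters = defaultdict(list)
for cluster, a, b, w in pairs:
    clusters[cluster].append((a, b, w))

results = {c: best_links(cands) for c, cands in clusters.items()}
for c, (total, links) in sorted(results.items()):
    print(c, total, links)
```

On the sample data this selects pair (1, 6) in cluster 1 for a total of 13, and pairs (4, 9) and (5, 8) in cluster 2 for a total of 25, which agrees with the desired result in the message below.
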
On Fri, 29 Aug 2003 17:25:52 +0200, Hans Reitsma <j.reitsma@AMC.UVA.NL>
wrote:
>Dear all
>
>After a probabilistic record linkage operation on two files
>(A and B) we have a file with potential links, i.e. pairs of
>records from file A and B whose total weights are above a
>certain cut-off value. Some of these potential links are
>intertangled (is that the right word?), meaning that 1 record
>from file A can be linked to two different records of file
>B. The weights can be different, but they are both above the
>cut-off value. We know that each record from file A (or B)
>can be linked to at most one record in the other file. This
>example is easy to solve: the pair with the highest weight
>wins and becomes the link, the other becomes a non-link.
>
>However, more complex situations occur (see also example
>data). In these situations, I want to obtain the solution
>that maximises the total sum of weights of all the pairs
>that belong to that solution. This means that the link with
>the highest weight does not automatically win. Here
>are some sample data to clarify the issue.
>
>
>Cluster  id_a  id_b  weight  Desired result  Total weight of solution (links)
>1        1     3     11      non-link
>1        1     6     13      link            13
>
>2        3     8     11      non-link
>2        4     9     11      link
>2        5     8     14      link
>2        5     9     13      non-link        25
>
>
>
>Any help, suggestions?
>
>
>
>Hans Reitsma, MD PhD
>
>Dept. of Clinical Epidemiology & Biostatistics
>
>PO Box 22700, 1100 DE, Amsterdam, The Netherlands
>
>Phone: +31-20-5663273, Fax: +31-20-6912683
>
>E-mail: j.reitsma@amc.uva.nl