Date: Mon, 27 Oct 2008 14:00:47 -0400
Reply-To: Peter Flom <peterflomconsulting@mindspring.com>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject: Finding nearest neighbors
Hello,
I'd like to find the nearest neighbors for each of a large number of subjects in a multivariate space.
I found PROC DISTANCE and used it as follows:
<<<
/* Pairwise distances between subjects, with rows and columns labelled by CORE */
proc distance data=allout out=distance;
   var interval(MAFAnT BAFAnT BRFAnT mif2f7a BAF1F7T
                BAF2F8T BAF2F7T BAF1F7S BAF2F8S BAF7F8S
                BFrF17S BFrF8ZS BRFAnB CoF7FZD MAF7T);
   id core;
run;
>>>
The data set "distance" now has 694 observations and 695 variables (CORE plus 694 distance columns). It is a large lower-triangular matrix of distances: the rows are CORE numbers, and the columns are the same CORE numbers prefixed with an underscore (_),
e.g.
Obs  core    _00614D  _08185C  _08394B  _08545A  _08558A  _08562A  _08564A  _08580A  _08581A
  1  00614D   0.0000        .        .        .        .        .        .        .        .
  2  08185C  10.1968   0.0000        .        .        .        .        .        .        .
  3  08394B   6.7983  12.4716   0.0000        .        .        .        .        .        .
  4  08545A  13.0307   4.5231  16.1410   0.0000        .        .        .        .        .
  5  08558A   7.1568   8.9766   8.1237  12.4371   0.0000        .        .        .        .
  6  08562A   7.1285   4.3278   9.7992   7.3852   6.5285   0.0000        .        .        .
with many more rows and columns.
What I'd like to get from this is a data set with 694 observations (one per CORE) and, say, 3 variables, NEAREST1, NEAREST2 and NEAREST3, identifying each subject's three nearest neighbors.
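For concreteness, here is an untested sketch of the sort of thing I have in mind. It assumes that rerunning PROC DISTANCE with SHAPE=SQUARE gives the full symmetric matrix (so each row holds all of that subject's distances), that the output contains only CORE plus the distance columns, and that every column name is the CORE value with a leading underscore. The names DISTSQ, NEAREST and DIST1-DIST3 are just placeholders I made up.
<<<
/* Assumption: SHAPE=SQUARE fills in the upper triangle as well */
proc distance data=allout out=distsq shape=square;
   var interval(MAFAnT BAFAnT BRFAnT mif2f7a BAF1F7T
                BAF2F8T BAF2F7T BAF1F7S BAF2F8S BAF7F8S
                BFrF17S BFrF8ZS BRFAnB CoF7FZD MAF7T);
   id core;
run;

data nearest (keep=core nearest1-nearest3 dist1-dist3);
   set distsq;
   /* Declared first, so it picks up only the distance columns */
   array d{*} _numeric_;
   length nearest1-nearest3 $ 32;
   array nn{3} $ nearest1-nearest3;   /* CORE of the 3 nearest neighbors */
   array nd{3} dist1-dist3;           /* their distances, ascending      */

   do j = 1 to dim(d);
      /* Skip the diagonal (the subject itself) and missing cells */
      if j = _n_ or missing(d{j}) then continue;
      do k = 1 to 3;
         if missing(nd{k}) or d{j} < nd{k} then do;
            /* Shift the worse candidates down one slot */
            do m = 3 to k + 1 by -1;
               nd{m} = nd{m-1};
               nn{m} = nn{m-1};
            end;
            nd{k} = d{j};
            /* Assumption: column name is "_" followed by the CORE value */
            nn{k} = substr(vname(d{j}), 2);
            leave;
         end;
      end;
   end;
run;
>>>
This keeps a running list of the three smallest distances in each row rather than sorting all 694 values, but with a matrix this size that may not matter much.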
I figure this must have been done before, but I couldn't find an example.
Thanks
Peter
Peter L. Flom, PhD
Statistical Consultant
www DOT peterflom DOT com