Date: Tue, 22 Oct 2002 21:57:16 -0400
Reply-To: Sigurd Hermansen <hermans1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <hermans1@WESTAT.COM>
Subject: Re: Querying Data without Replacement - Proc SQL?
"...pick a y, then
exclude that y from the possible matches the subsequent x's can choose..."
The adjective 'subsequent' tells me immediately that your problem does not
have a SQL solution. SQL applies the same transformation to each row of
data independently, and does not condition on outcomes of processing of
prior rows. Sorry.
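To make the row-order dependence concrete, here is a minimal procedural sketch of the sequential pass described in the question. It is in Python only for illustration, not something SQL or the original poster's code does. The function name greedy_unique_match, the ascending-x processing order, and the tie-break toward the smaller y are all assumptions of the sketch. A greedy pass like this also depends on the processing order and need not give the assignment with the smallest total distance.

```python
def greedy_unique_match(pairs):
    """Greedy matching without replacement (hypothetical helper).

    pairs: iterable of (x, y) candidate rows.
    Returns a dict mapping each x to its chosen y; an x whose candidates
    have all been taken already is left out.
    """
    # Group the candidate y values by x.
    candidates = {}
    for x, y in pairs:
        candidates.setdefault(x, []).append(y)

    used = set()   # y values already claimed by an earlier x
    result = {}
    # Assumption: x values are processed in ascending order.
    for x in sorted(candidates):
        free = [y for y in candidates[x] if y not in used]
        if not free:
            continue
        # Closest remaining y; ties go to the smaller y (assumption).
        best = min(free, key=lambda y: (abs(x - y), y))
        result[x] = best
        used.add(best)
    return result


# The x/y rows from the example dataset in the original question.
rows = [(1, 1), (1, 5), (1, 7),
        (2, 1), (2, 10), (2, 17),
        (3, 1), (3, 20), (3, 27)]
matches = greedy_unique_match(rows)
print(matches)
```

The choice made for each x depends on the set of y values consumed by the x's before it, which is exactly the conditioning on prior rows that a single set-based query does not express.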
An alternative definition of the problem may have a SQL solution. Does the
physical sequencing of the data in a file have any significance? Can you
make that relation explicit in the data?
Sig
On Tue, 22 Oct 2002 17:45:18 -0400, Brian Preslopsky <m1bkp00@FRB.GOV>
wrote:
>Let me give an example of my problem. Imagine a dataset:
>
>x y
>1 1
>1 5
>1 7
>2 1
>2 10
>2 17
>3 1
>3 20
>3 27
>
>For each x I want to pick y s.t. abs(x-y) = min(abs(x-y)). The simple
>resolution is that all x's pick y=1. The wrinkle is that y must be
>unique. So essentially I want to go through each x and pick a y, then
>exclude that y from the possible matches the subsequent x's can choose
>from.
>
>This example may seem simple, but I need to do this efficiently. I
>actually have a data set of about half a million observations. It was
>generated using proc sql, and I was looking for a sql solution. So far
>I have only come up with something using nested subqueries, but these
>use an impractical amount of computing time; on a smaller test dataset
>of only 4000 observations, I am already up to 10 minutes.
>
>I think I can come up with a macro to do this for me; it just seemed to
>me to be something I should be able to do with sql. I am not a big sql
>expert, so I might be missing something obvious.
>
>If anyone has done any kind of querying without replacement such as what
>I have described, I would be interested in hearing from you.
>
>Brian Preslopsky