Date: Mon, 7 Oct 1996 09:58:26 PDT
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
Subject: Re: Indexing annoyance
Jay, I'll propose a tolerable approach:
BY ID DATE2;
IF A AND B AND FIRST.ID;
The view NEWA will use the index to set A in your desired order. Because it is
a view, you do not have the I/O of creating an intermediate dataset.
I must caution you, if your observation is small relative to your block size, it
is faster to sort than to use the index to read.
Tim Berryhill - Contract Programmer and General Wizard
Frequently at Pacific Gas & Electric Co., San Francisco
The correlation coefficient between their views and
my postings is slightly less than 0
----------------------[Reply - Original Message]----------------------
Suppose dataset A is sorted by var ID then by DATE1 within each ID;
it's also indexed by var ID then by var DATE2 within each ID. There
are multiple observations for each combination ID,DATE1 and each
Dataset B contains just a short list of IDs, sorted by ID, with one
observation per ID.
Now I want to select the observations in A with the earliest value of
DATE2 for each ID that also appears in B. An obvious strategy is
something along the lines of:
data new; merge
if ina and inb and first.id;
The problem here is that the "natural" sort order of A will be used,
so that I'll end up with the earliest DATE1 instead of the earliest
DATE2. I need some way of specifying that SAS should use the index for
A instead of the sort order. I can of course use an extra sort or
datastep to make sure the cases are in the correct order, but this
seems to me (a) inelegant and (b) inefficient if A is large.