Date: Mon, 14 Mar 2005 17:13:19 -0500
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: Lookahead [was: output to two different datasets based on a pattern]
On Sat, 12 Mar 2005 00:15:16 -0800, Jack Hamilton <jfh@STANFORDALUMNI.ORG>
wrote:
>Here's a version which doesn't use BY. I think the code is simpler, but it
>might not be as flexible as yours:
>
>=====
>data lookahead1 (drop=nextk);
>
>  set demo end=endmain;
>
>  if not endnext then
>    set demo (firstobs=2
>              rename=(k=nextk val=nextval))
I would always include a KEEP= option here (KEEP=K VAL, since KEEP= is
applied before RENAME=). Without it, every other variable in DEMO is
overwritten with its value from the next observation.
>        end=endnext;
>
>  if (k ne nextk) or endmain then
If several variables define the groups, each one must be renamed in the
second SET and tested here, with the comparisons joined by OR.
>    nextval = .;
>
>run;
>=====
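
To make both remarks concrete, here is a minimal sketch of the same step
with KEEP= added and two grouping variables tested. The data set DEMO2 and
the variables K1, K2 and NOTE are made up for illustration; they are not
part of the original example.

```sas
/* Hypothetical test data: two grouping variables (K1, K2) plus an
   extra variable NOTE that the lookahead must leave alone. */
data demo2;
  input k1 $ k2 $ val note $ @@;
  cards;
a x 11 p   a x 12 q   a y 13 r   b y 21 s
;

data lookahead2 (drop=nextk1 nextk2);
  set demo2 end=endmain;

  if not endnext then
    set demo2 (firstobs=2
               keep=k1 k2 val   /* original names: KEEP= is applied before RENAME= */
               rename=(k1=nextk1 k2=nextk2 val=nextval))
        end=endnext;

  /* A change in any grouping variable marks the end of the group. */
  if (k1 ne nextk1) or (k2 ne nextk2) or endmain then
    nextval = .;
run;
```

With these four observations NEXTVAL should be 12 on the first one and
missing on the others, and NOTE keeps the value from its own observation.
Dropping KEEP= would let the second SET replace NOTE with the next
observation's value.
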
>
>Results are the same using your sample data. Neither solution actually
>answers the original question, but that's life on the internet.
>
>
>> -----Original Message-----
>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On
>> Behalf Of Howard Schreier <hs AT dc-sug DOT org>
>> Sent: Friday, March 11, 2005 6:19 am
>> To: SAS-L@LISTSERV.UGA.EDU
>> Subject: [SAS-L] Lookahead [was: output to two different
>> datasets based on a pattern]
>>
>> As Jack implies, the technique using FIRSTOBS=2 gets messy when there
>> are BY groups.
>>
>> Here is a model for a lookahead technique which works within BY groups.
>>
>> data demo;
>>   input k $ val @@ ;
>>   cards;
>> a 11 a 12 a 13
>> b 21
>> c 31 c 32
>> d 41 d 42 d 43 d 44
>> ;
>>
>> data lookahead (drop=i);
>>   set demo;
>>   by k;
>>   if (first.k or not last.k) then
>>     do i = 1 to 1 + (first.k and not last.k);
>>       set demo(keep=val rename=(val=nextval) );
>>     end;
>>   if last.k then do;
>>     nextval = .;
>>   end;
>> run;
>>
>> The second SET statement does the lookahead. It must read two
>> observations at the start of a group but none at the end. The case
>> where there is only one observation in the group must also be handled.
>> So there is one expression to decide whether the SET executes at all,
>> and another expression to determine how many times.
>>
>> Result:
>>
>> Obs    k    val    nextval
>>
>>   1    a     11       12
>>   2    a     12       13
>>   3    a     13        .
>>   4    b     21        .
>>   5    c     31       32
>>   6    c     32        .
>>   7    d     41       42
>>   8    d     42       43
>>   9    d     43       44
>>  10    d     44        .
>>
>> It should be possible to generalize this for deeper lookaheads, but
>> it's not trivial. LAST.K is not useful, because it's necessary to
>> detect second-from-last etc. I think it has to be done using a Double
>> DoW so that the size of each group can be determined and used in the
>> controlling expressions.
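
For what it's worth, a deeper lookahead can also be sketched without BY
processing by extending the FIRSTOBS= idea: one extra read stream per step
ahead, combined in a one-to-one MERGE. This is not the Double DoW approach
described above, just a simpler alternative shown for depth 2. The names
K_1, K_2, NEXTVAL1 and NEXTVAL2 are made up for the example, and it assumes
K is never blank, because a blank key would be taken for a group boundary.

```sas
/* Depth-2 lookahead on DEMO: stream 2 runs one observation ahead,
   stream 3 runs two observations ahead. */
data lookahead_deep (drop=k_1 k_2);
  /* No BY statement, so this is a one-to-one merge. When a shorter
     stream runs out, its variables are set to missing. */
  merge demo
        demo (firstobs=2 keep=k val rename=(k=k_1 val=nextval1))
        demo (firstobs=3 keep=k val rename=(k=k_2 val=nextval2));

  /* Discard any value that was fetched from a different group. */
  if k_1 ne k then nextval1 = .;
  if k_2 ne k then nextval2 = .;
run;
```

For group "a" in DEMO this should give (12, 13), (13, .) and (., .). Each
further level of lookahead costs one more pass over the data, so the Double
DoW may still be the better choice for deep lookaheads or large files.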
[snip]