On Jan 28, 2009, at 5:05 pm, Joe Matise wrote:
> The problem I kept hitting with lag is that lag is defined based on the
> source dataset, so you can't really use it the way you want to (which is
> to see if the previous record was deleted). You also can't use it
> conditionally; you need to define a variable with the lag value first,
> and then use that variable conditionally.
Two misconceptions here:
1) LAG is defined based on previous calls to the same LAG, not on
what's in the source data set.
2) You can call LAG conditionally; it's not often done correctly, but
I recall seeing an example once.
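
To make both points concrete, here is a minimal sketch (untested here, so
treat it as an illustration rather than a reference). It assumes Art's HAVE
data set from further down the thread; the names LAG_DEMO, PREV_X and
PREV_KEPT_X are made up for this example. Each LAG call in the program keeps
its own queue, and the queue is only updated when that particular call
executes.

```sas
data have;
  input id x;
  cards;
1 1
1 2
1 3
1 4
2 1
2 2
2 3
3 2
3 3
3 4
;
run;

data lag_demo;
  set have;

  /* Unconditional call: executes on every record, so it returns
     the value of x from the previous record that was read. */
  prev_x = lag(x);

  /* Conditional call: this is a separate LAG with its own queue.
     It executes only when x ne 1, so it returns the value of x
     from the previous record on which this statement ran, not
     from the previous record in the data set. */
  if x ne 1 then prev_kept_x = lag(x);
run;
```

On the sixth record (id=2, x=2), PREV_X should be 1, the value from the
record just before it, while PREV_KEPT_X should be 4, the value from the
last record on which the conditional call ran. On records where x eq 1 the
conditional call does not run, so PREV_KEPT_X stays missing and the queue is
left untouched.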
--
Jack Hamilton
jfh@alumni.stanford.org
Videtis illam spirare libertatis auram
On Jan 28, 2009, at 5:05 pm, Joe Matise wrote:
> The problem I kept hitting with lag is that lag is defined based on the
> source dataset, so you can't really use it the way you want to (which is
> to see if the previous record was deleted). You also can't use it
> conditionally; you need to define a variable with the lag value first,
> and then use that variable conditionally.
>
> So:
> lagid = lag(first.id);
> if lagid ... etc.
>
> Of course you still need to figure out if the previous iteration was
> deleted, which is the difficulty. I think the DOW loop, or perhaps a
> double DOW loop, is necessary here to accomplish the precise task you set
> (my solutions generally do the deletions, but they don't give a first_id
> value, though you certainly could get first.id later on).
>
> -Joe
>
> On Wed, Jan 28, 2009 at 6:59 PM, Arthur Tabachneck <art297@netscape.net> wrote:
>
>> Joe,
>>
>> Much appreciated. I thought of another way, during dinner, but also
>> not "quite" perfect:
>>
>> data want;
>> set have;
>> by id;
>> if lag(first.id) eq 1 then first.id=1;
>> deleted=0;
>> if first.id then do;
>> if x eq 1 then do;
>> delete;
>> end;
>> end;
>> first_id=first.id;
>> run;
>>
>> Art
>> -------
>> On Wed, 28 Jan 2009 18:45:42 -0600, Joe Matise <snoopy369@GMAIL.COM>
>> wrote:
>>
>>> You certainly can't manipulate it the way you suggest directly, as that
>>> would be inconsistent with how a data step works.
>>>
>>> The ways I could imagine doing something like that:
>>>
>>> -- Use a counter:
>>> data have;
>>>   input id x;
>>>   cards;
>>> 1 1
>>> 1 1
>>> 1 3
>>> 1 4
>>> 2 1
>>> 2 2
>>> 2 3
>>> 3 2
>>> 3 3
>>> 3 4
>>> ;
>>> run;
>>>
>>> proc sort data=have;
>>>   by id;
>>> run;
>>>
>>> data have2;
>>>   set have;
>>>   by id;
>>>   if first.id then counter=0;
>>>   counter+1;
>>> run;
>>>
>>> data want;
>>>   set have2;
>>>   by id;
>>>   retain firstid;
>>>   if first.id then firstid=1;
>>>   if counter = firstid then do;
>>>     if x eq 1 then do;
>>>       firstid=firstid+1;
>>>       delete;
>>>     end;
>>>   end;
>>> run;
>>> This yields 7 records, deleting the first two records in id=1 and the
>>> first record in id=2.
>>>
>>> -- Use a retained tracker (same HAVE dataset):
>>>
>>> data want;
>>>   set have;
>>>   by id;
>>>   retain nodelete;
>>>   if first.id then nodelete=0;
>>>   if nodelete=0 then do;
>>>     if x eq 1 then do;
>>>       delete;
>>>     end;
>>>     else nodelete=1;
>>>   end;
>>> run;
>>>
>>> -- DOW loop:
>>>
>>> data want;
>>>   do y=1 by 1 until (last.id or x ne 1);
>>>     set have;
>>>     by id;
>>>   end;
>>>   drop y;
>>> run;
>>>
>>> I'm not 100% sure why this works, though. I was expecting it not to
>>> work at all, but it does seem to work, at least for the example (and
>>> also for the example where x=1 further down from first.id; it keeps
>>> that record).
>>>
>>> -- Perhaps use LAG, though I can't come up with a functional example.
>>>
>>> -Joe
>>>
>>>
>>>
>>> On Wed, Jan 28, 2009 at 6:06 PM, Arthur Tabachneck <art297@netscape.net> wrote:
>>>
>>>> Akshaya asked a question earlier today that made me think of a solution
>>>> that would require a first.variable that can be manipulated. However, I
>>>> don't know how one could accomplish that. For example, given the data
>>>> file:
>>>>
>>>> data have;
>>>> input id x;
>>>> cards;
>>>> 1 1
>>>> 1 2
>>>> 1 3
>>>> 1 4
>>>> 2 1
>>>> 2 2
>>>> 2 3
>>>> 3 2
>>>> 3 3
>>>> 3 4
>>>> ;
>>>>
>>>> how can one, or can one, reset first.id if, because of a delete, it
>>>> isn't the first record anymore? For example:
>>>>
>>>> data want;
>>>> set have;
>>>> by id;
>>>> if first.id then do;
>>>> if x eq 1 then do;
>>>> delete;
>>>> /* reset first.id here*/;
>>>> end;
>>>> end;
>>>> first_id=first.id;
>>>> run;
>>>>
>>>> would (if the missing command were added) end up looking like:
>>>>
>>>> data desired_want;
>>>> input id x first_id;
>>>> cards;
>>>> 1 2 1
>>>> 1 3 0
>>>> 1 4 0
>>>> 2 2 1
>>>> 2 3 0
>>>> 3 2 1
>>>> 3 3 0
>>>> 3 4 0
>>>> ;
>>>>
>>>> Just wondering,
>>>> Art
>>>>
>>
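
For what it's worth, the DESIRED_WANT output in Art's original question can
probably be produced without resetting first.id at all, by carrying a
retained flag alongside it. The sketch below is untested and only one
possible approach; it assumes the HAVE data set from Art's post, and SEEKING
is a made-up variable name meaning "no record has been kept yet for this id".

```sas
data want;
  set have;
  by id;

  /* seeking = 1 until the first record of the BY group is kept */
  retain seeking;
  if first.id then seeking = 1;

  /* Drop leading x=1 records. DELETE returns to the top of the
     step, so seeking keeps its value of 1 for the next record. */
  if seeking and x eq 1 then delete;

  /* The first record that survives is the "new" first record */
  first_id = seeking;
  seeking = 0;

  drop seeking;
run;
```

Because the flag is only cleared once a record survives, a run of several
leading x=1 records within an id is removed as a whole, and an x=1 record
later in the group is kept.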