|
Paul,
Try specifying the LAGs first then set them to missing for first.site_id.
I think this works becuase the lags are performed on every record.
LAG_STOP = LAG(STOP_DATE);
LAG_MIN_STOP = LAG(MIN_STOP);
LAG_MAX_STOP = LAG(MAX_STOP);
if first.site_id then do;
lag_stop = .;
lag_min_stop = .;
lag_max_stop = .;
end;
-Brandon
On Tue, 6 Mar 2007 13:53:23 -0500, Paul Miller <pmiller@OHTN.ON.CA> wrote:
>Hi Jack,
>
>SITE_ID is my id variable. And my data are sorted by SITE_ID. I've tried
>syntax like what you recommend but it's not working. I understand that
>lag retains the value that it had the last time it executed. So
>presumably it will retain this value even if I set the lag variables = .
>for the first site_id?
>
>In other words, won't it say "Sure, you set these values to missing. But
>I still have the value from the last time I executed. And the next time
>I execute I'm going to write those values, even if they're from a
>previous site_id"?
>
>At least this is what seems to be happening.
>
>Thanks,
>
>Paul
>
>Paul J. Miller, Ph.D.
>Research Scientist and Statistician
>Ontario HIV Treatment Network
>1300 Yonge St., Suite 308
>Toronto, Ontario M4T 1X3
>Phone: (416) 642-6486 ext 232
>Fax: (416) 640-4245
>
>-----Original Message-----
>From: Jack Clark [mailto:JClark@chpdm.umbc.edu]
>Sent: March 6, 2007 1:46 PM
>To: Paul Miller; SAS-L@LISTSERV.UGA.EDU
>Subject: RE: Problem combining LAG with a BY variable
>
>Paul,
>
>You mention patient ID in your post, but I only see SITE_ID in your
>code.
>Is SITE_ID the variable for patient ID?
>
>If so...
>
>You could initialize the 3 lag variables to missing each time you start
>a
>new value of SITE_ID...
>
> if first.site_id then do;
> lag_stop = .;
> lag_min_stop = .;
> lag_max_stop = .;
> end;
>
> else do;
> LAG_STOP = LAG(STOP_DATE);
> LAG_MIN_STOP = LAG(MIN_STOP);
> LAG_MAX_STOP = LAG(MAX_STOP);
> end;
>
>Keep in mind that the use of first.site_id relies on the input data set
>being sorted by SITE_ID and the dataset being SET BY SITE_ID (as it was
>in
>your code).
>
>
>Jack Clark
>Research Analyst
>Center for Health Program Development and Management
>University of Maryland, Baltimore County
>
>
>-----Original Message-----
>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>Paul
>Miller
>Sent: Tuesday, March 06, 2007 1:15 PM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Problem combining LAG with a BY variable
>
>Hi Everyone,
>
>
>
>I'm having what I suspect will be an easy problem for many. I want to
>calculate a set of lag variables within each value of a patient id. My
>syntax appears below. I'm having trouble with the three lag variables
>LAG_STOP, LAG_MIN_STOP, LAG_MAX_STOP. Currently, the lag function is
>carrying values from one id forward into the next instead of doing the
>lag within each value of id. What's the best way to fix this?
>
>
>
>Thanks,
>
>
>
>Paul
>
>
>
>DATA HAART_INTERRUPT(DROP=EVENT_ID DURATION--HAART);
>
> RETAIN SITE_ID START_DATE STOP_DATE
>
> MIN_START MAX_START MID_START MIN_STOP
>MAX_STOP MID_STOP;
>
> SET REGIMENS (WHERE=(HAART=1));
>
> BY SITE_ID;
>
> FORMAT START_DATE--MID_STOP MMDDYY8.;
>
> MID_START = SUM(MIN_START,MAX_START)/2;
>
> MID_STOP = SUM(MIN_STOP,MAX_STOP)/2;
>
>
>
> LAG_STOP = LAG(STOP_DATE);
>
> LAG_MIN_STOP = LAG(MIN_STOP);
>
> LAG_MAX_STOP = LAG(MAX_STOP);
>
>
>
> IF START_DATE NE . AND LAG_STOP NE . THEN DO;
>
> STOP_DURATION = START_DATE -
>LAG_STOP;
>
> MIN_STOP_DURATION = STOP_DURATION;
>
> MAX_STOP_DURATION = STOP_DURATION;
>
> END;
>
>
>
> IF START_DATE NE . AND LAG_STOP = . THEN DO;
>
> STOP_DURATION = .;
>
> MIN_STOP_DURATION = START_DATE -
>LAG_MAX_STOP;
>
> MAX_STOP_DURATION = START_DATE -
>LAG_MIN_STOP;
>
> END;
>
>
>
> IF START_DATE = . AND LAG_STOP NE . THEN DO;
>
> STOP_DURATION = .;
>
> MIN_STOP_DURATION = MIN_START -
>LAG_STOP;
>
> MAX_STOP_DURATION = MAX_START -
>LAG_STOP;
>
> END;
>
>
>
> IF START_DATE = . AND LAG_STOP = . THEN DO;
>
> STOP_DURATION = .;
>
> MIN_STOP_DURATION = MIN_START -
>LAG_MAX_STOP;
>
> MAX_STOP_DURATION = MAX_START -
>LAG_MIN_STOP;
>
> END;
>
>RUN;
|