LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2007, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 6 Mar 2007 14:14:35 -0500
Reply-To:     Jack Clark <JClark@CHPDM.UMBC.EDU>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jack Clark <JClark@CHPDM.UMBC.EDU>
Subject:      Re: Problem combining LAG with a BY variable
Comments: To: Paul Miller <pmiller@OHTN.ON.CA>
Content-Type: text/plain

Paul,

After reviewing the SAS online doc more carefully, I saw that it gave a good example of how to work with your type of problem.

Try something like this...

data test; infile cards missover; input site_id $ stop_date min_stop max_stop haart; cards; AA 10 11 12 1 AA 13 14 15 1 AA 16 17 18 0 BB 20 21 22 0 BB 23 24 25 1 BB 26 27 28 1 CC 30 31 32 1 DD 40 41 42 1 DD 43 44 45 0 DD 46 47 48 1 DD 49 50 51 1 ; run;

proc sort data = test; by site_id; run;

data new; set test (where=(haart=1)); by site_id;

LAG_STOP = LAG(STOP_DATE); LAG_MIN_STOP = LAG(MIN_STOP); LAG_MAX_STOP = LAG(MAX_STOP);

if first.site_id then do; lag_stop = .; lag_min_stop = .; lag_max_stop = .; end;

run;

It seems that you have to assign the values of your lag variables to missing after you calculate them (for the first observation of each ID).

Let us know if this works for you.

Jack Clark Research Analyst Center for Health Program Development and Management University of Maryland, Baltimore County

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul Miller Sent: Tuesday, March 06, 2007 1:53 PM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: Problem combining LAG with a BY variable

Hi Jack,

SITE_ID is my id variable. And my data are sorted by SITE_ID. I've tried syntax like what you recommend but it's not working. I understand that lag retains the value that it had the last time it executed. So presumably it will retain this value even if I set the lag variables = . for the first site_id?

In other words, won't it say "Sure, you set these values to missing. But I still have the value from the last time I executed. And the next time I execute I'm going to write those values, even if they're from a previous site_id"?

At least this is what seems to be happening.

Thanks,

Paul

Paul J. Miller, Ph.D. Research Scientist and Statistician Ontario HIV Treatment Network 1300 Yonge St., Suite 308 Toronto, Ontario M4T 1X3 Phone: (416) 642-6486 ext 232 Fax: (416) 640-4245

-----Original Message----- From: Jack Clark [mailto:JClark@chpdm.umbc.edu] Sent: March 6, 2007 1:46 PM To: Paul Miller; SAS-L@LISTSERV.UGA.EDU Subject: RE: Problem combining LAG with a BY variable

Paul,

You mention patient ID in your post, but I only see SITE_ID in your code. Is SITE_ID the variable for patient ID?

If so...

You could initialize the 3 lag variables to missing each time you start a new value of SITE_ID...

if first.site_id then do; lag_stop = .; lag_min_stop = .; lag_max_stop = .; end;

else do; LAG_STOP = LAG(STOP_DATE); LAG_MIN_STOP = LAG(MIN_STOP); LAG_MAX_STOP = LAG(MAX_STOP); end;

Keep in mind that the use of first.site_id relies on the input data set being sorted by SITE_ID and the dataset being SET BY SITE_ID (as it was in your code).

Jack Clark Research Analyst Center for Health Program Development and Management University of Maryland, Baltimore County

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul Miller Sent: Tuesday, March 06, 2007 1:15 PM To: SAS-L@LISTSERV.UGA.EDU Subject: Problem combining LAG with a BY variable

Hi Everyone,

I'm having what I suspect will be an easy problem for many. I want to calculate a set of lag variables within each value of a patient id. My syntax appears below. I'm having trouble with the three lag variables LAG_STOP, LAG_MIN_STOP, LAG_MAX_STOP. Currently, the lag function is carrying values from one id forward into the next instead of doing the lag within each value of id. What's the best way to fix this?

Thanks,

Paul

DATA HAART_INTERRUPT(DROP=EVENT_ID DURATION--HAART);

RETAIN SITE_ID START_DATE STOP_DATE

MIN_START MAX_START MID_START MIN_STOP MAX_STOP MID_STOP;

SET REGIMENS (WHERE=(HAART=1));

BY SITE_ID;

FORMAT START_DATE--MID_STOP MMDDYY8.;

MID_START = SUM(MIN_START,MAX_START)/2;

MID_STOP = SUM(MIN_STOP,MAX_STOP)/2;

LAG_STOP = LAG(STOP_DATE);

LAG_MIN_STOP = LAG(MIN_STOP);

LAG_MAX_STOP = LAG(MAX_STOP);

IF START_DATE NE . AND LAG_STOP NE . THEN DO;

STOP_DURATION = START_DATE - LAG_STOP;

MIN_STOP_DURATION = STOP_DURATION;

MAX_STOP_DURATION = STOP_DURATION;

END;

IF START_DATE NE . AND LAG_STOP = . THEN DO;

STOP_DURATION = .;

MIN_STOP_DURATION = START_DATE - LAG_MAX_STOP;

MAX_STOP_DURATION = START_DATE - LAG_MIN_STOP;

END;

IF START_DATE = . AND LAG_STOP NE . THEN DO;

STOP_DURATION = .;

MIN_STOP_DURATION = MIN_START - LAG_STOP;

MAX_STOP_DURATION = MAX_START - LAG_STOP;

END;

IF START_DATE = . AND LAG_STOP = . THEN DO;

STOP_DURATION = .;

MIN_STOP_DURATION = MIN_START - LAG_MAX_STOP;

MAX_STOP_DURATION = MAX_START - LAG_MIN_STOP;

END;

RUN;


Back to: Top of message | Previous page | Main SAS-L page