Date: Wed, 30 Nov 2005 16:43:22 -0600
Reply-To: baogong jiang <bgjiang@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: baogong jiang <bgjiang@GMAIL.COM>
Subject: Re: help with LAG function
In-Reply-To: <592E8923DB6EA348BE8E33FCAADEFFFC13EED7FF@dshs-exch2.dshs.wa.lcl>
Content-Type: text/plain; charset=ISO-8859-1
Dan:
Thank you for the detailed explanation.
baogong
On 11/30/05, Nordlund, Dan <NordlDJ@dshs.wa.gov> wrote:
>
> > -----Original Message-----
> > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> > baogong jiang
> > Sent: Wednesday, November 30, 2005 9:19 AM
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: Re: help with LAG function
> >
> > Here is the code that works for me:
> >
> >
> > data new;
> >   set old;
> >   by id;
> >   lagserv = lag(serv_from_date);
> >   retain gap;
> >   if first.id then gap = .;
> >   /* Convert the yyyymmdd numbers to SAS dates, then take the difference in days. */
> >   else gap = input(put(serv_from_date, 8.), yymmdd8.)
> >            - input(put(lagserv, 8.), yymmdd8.);
> > run;
> >
> > Why does the following not work?
> >
> > data new;
> >   set old;
> >   by id;
> >   * lagserv = lag(serv_from_date);
> >   retain gap;
> >   if first.id then gap = .;
> >   else gap = input(put(serv_from_date, 8.), yymmdd8.)
> >            - input(put(lag(serv_from_date), 8.), yymmdd8.);
> > run;
> >
> >
> >
> > Thank you very much,
> > baogong
>
> Baogong,
>
> A common misunderstanding of the LAG function is that it somehow retrieves
> the value from the immediately preceding record. However, what actually
> happens is that each separate use of the LAG function sets up a separate
> first-in/first-out (FIFO) queue. For example, the statement
>
> lagserv = lag(serv_from_date);
>
> creates a FIFO queue (at compile time, I believe, initially populated with
> missing values). Each time the statement executes at run time, it retrieves
> the value at the front of the queue and then places the current value of
> serv_from_date at the end of the queue.
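
A minimal sketch of the queue behavior described above. The dataset name
DEMO and the variables X and PREV are made up for illustration and are not
part of the original data:

```sas
/* Hypothetical three-record example: LAG is called on every record. */
data demo;
  input x;
  /* Returns the value waiting in the queue, then queues the current x. */
  prev = lag(x);
  datalines;
10
20
30
;
run;

proc print data=demo;
run;
```

PREV should be missing on the first record, because the queue starts out
holding a missing value, and then 10 and 20 on the second and third records.
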
>
> The reason the second DATA step above doesn't work is that the LAG function
> is not executed when first.id is true, so the value of serv_from_date for
> that record is never placed in the queue. When you call the LAG function on
> the second record of a BY group, you retrieve the value of serv_from_date
> that was current the last time the LAG function executed (which, in your
> case, will "usually" be from the last record of the previous BY group).
>
> That is why it is "dangerous" to conditionally execute the LAG function.
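
To see the difference side by side, here is a small sketch with invented
data. The dataset names DEMO2 and COMPARE and the variables X, PREV_ALL,
PREV_COND and PREV_SAFE are assumptions for illustration only:

```sas
/* Hypothetical data: two BY groups of two records each, sorted by id. */
data demo2;
  input id x;
  datalines;
1 10
1 20
2 30
2 40
;
run;

data compare;
  set demo2;
  by id;

  /* Unconditional call: this queue sees every record. */
  prev_all = lag(x);

  /* Conditional call: this separate queue only sees non-first records. */
  if not first.id then prev_cond = lag(x);

  /* One safe pattern: call LAG unconditionally, then blank it at first.id. */
  if first.id then prev_safe = .;
  else prev_safe = prev_all;
run;

proc print data=compare;
run;
```

On the last record (id=2, x=40), PREV_ALL and PREV_SAFE should be 30, while
PREV_COND should be 20, which is the value from the last record of id 1. On
the second record (id=1, x=20), PREV_COND should be missing rather than 10,
because x=10 was never placed in that queue.
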
>
> One can get lags of different sizes using LAGn, where n is a number. For
> example, lag2(var) sets up a FIFO queue of size 2. At run time it works the
> same way as the LAG function (pull a value from the front of the queue,
> place the current value at the end); the queue is just longer.
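
A short sketch of LAG next to LAG2, again with made-up names (DEMO3, X,
PREV1, PREV2) that are not from the original post:

```sas
/* Hypothetical four-record example comparing queue lengths 1 and 2. */
data demo3;
  input x;
  prev1 = lag(x);   /* queue of length 1 */
  prev2 = lag2(x);  /* separate queue of length 2 */
  datalines;
10
20
30
40
;
run;

proc print data=demo3;
run;
```

PREV1 should come out as missing, 10, 20, 30, and PREV2 as missing, missing,
10, 20, since the longer queue takes two calls before a real value reaches
the front.
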
>
> As usual, if I have gotten anything wrong or misleading here, someone will
> be along shortly to correct it.
>
> Hope this is helpful,
>
> Dan
>
> Daniel J. Nordlund
> Research and Data Analysis
> Washington State Department of Social and Health Services
> Olympia, WA 98504-5204
>
>
>
--
Baogong Jiang
Department of Agronomy
Louisiana State University