Date: Fri, 23 May 2008 17:28:52 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: binary 'switch' variable
In-Reply-To: <8AD8F86B3312F24CB432CEDDA71889F205987F1D@ex06.GHCMASTER.GHC.ORG>
Content-Type: text/plain; charset="us-ascii"
I see the switches within groups of cellid in the example dataset but
not the requirements. Example data makes things easier to understand.
Thanks for providing them.
Roy's solution follows the example and presupposes that you have enough
disk space and run time to sort the full dataset. A lag1() function
would also work
well provided that you assign the lag to a new variable in one statement
and then include the new variable in a condition:
lagC=lag(cellid);
lagD=lag(depth);
if lagC=cellid then do; if lagD<0 and depth>=0 then rev=1;
else if lagD>=0 and depth<0 then rev=0;
end;
else rev=0;
I'd consider using a Data step view to define rev.
Of course the burden of sorting the original dataset and perhaps
maintaining more than one version becomes much heavier with increasing
numbers of observations. With more information about how you plan to use
the rev attribute, we might be able to suggest something more efficient.
S
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of Pardee, Roy
Sent: Friday, May 23, 2008 3:28 PM
To: wyldsoul@GMAIL.COM; SAS-L@LISTSERV.UGA.EDU
Subject: RE: binary 'switch' variable
Maybe something like:
proc sort data = have ;
by cellid date ;
run ;
data want ;
retain _last_depth rev 0 ;
set have ;
by cellid ;
if first.cellid then do ;
rev = 0 ;
end ;
if _last_depth lt 0 and depth > 0 then rev = 1 ;
_last_depth = depth ;
run ;
Something like that anyway...
HTH,
-Roy
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
wyldsoul@GMAIL.COM
Sent: Friday, May 23, 2008 12:06 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: binary 'switch' variable
Hi,
I have a dataset with daily water depth values at different
locations. The full dataset has over 100 million obs. I am looking to
create a binary variable that acts as a switch when depth goes from a
negative value to a positive value. Starting with the first date and
location (cellid) in the data set, I want the new variable (call it
rev) to be 0 until the date when depth goes from a negative value to a
positive value. From that date forth, I want rev to =1. I've attached
a sample dataset below that is what I am looking for. Thank you for
your help.
Bryan
For example:
Date cellid depth rev
1/1/2008 12 1 0
1/2/2008 12 2 0
1/3/2008 12 3 0
1/4/2008 12 5 0
1/5/2008 12 2 0
1/6/2008 12 6 0
1/7/2008 12 4 0
1/8/2008 12 2 0
1/9/2008 12 -1 0
1/10/2008 12 -1 0
1/11/2008 12 -1 0
1/12/2008 12 -2 0
1/13/2008 12 -3 0
1/14/2008 12 -1 0
1/15/2008 12 1 1
1/16/2008 12 2 1
1/17/2008 12 3 1
1/18/2008 12 5 1
1/19/2008 12 4 1
1/20/2008 12 3 1
1/21/2008 12 3 1
1/22/2008 12 1 1
1/23/2008 12 -1 1
1/24/2008 12 -2 1
1/25/2008 12 -1 1
1/26/2008 12 -2 1
1/27/2008 12 1 1
1/28/2008 12 4 1
1/29/2008 12 2 1
1/30/2008 12 5 1
1/31/2008 12 6 1
1/1/2008 13 5 0
1/2/2008 13 7 0
1/3/2008 13 5 0
1/4/2008 13 4 0
1/5/2008 13 2 0
1/6/2008 13 -3 0
1/7/2008 13 -2 0
1/8/2008 13 -4 0
1/9/2008 13 2 1
1/10/2008 13 3 1
1/11/2008 13 4 1
1/12/2008 13 7 1
1/13/2008 13 7 1
1/14/2008 13 2 1
1/15/2008 13 -1 1
1/16/2008 13 -2 1
1/17/2008 13 1 1
1/18/2008 13 5 1
1/19/2008 13 1 1
1/20/2008 13 2 1
1/21/2008 13 3 1
1/22/2008 13 5 1
1/23/2008 13 4 1
1/24/2008 13 5 1
1/25/2008 13 6 1
1/26/2008 13 4 1
1/27/2008 13 -1 1
1/28/2008 13 -1 1
1/29/2008 13 -2 1
1/30/2008 13 -3 1
1/31/2008 13 -5 1