LISTSERV at the University of Georgia
Date:         Mon, 27 Jun 2005 16:53:55 -0400
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:      Re: Optimization question

1. I don't see anything to initialize PERMNO. Did you mean

if id in(&list);

? If not, the interaction of the subsetting IF and the "FIRST." tests may be problematic.

2. "Each id will have a [note singular] daily entry" suggests that FIRST.DATE will be true for every observation, in which case you don't have to test it and you don't even need DATE in the BY statement.

3. About how many different ID values are there in the data set?
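
If the answers to points 1 and 2 are both "yes" (PERMNO was meant to be ID, and each ID has exactly one observation per date), the step might reduce to something like the sketch below. It rests on those two assumptions and has not been run against the actual data; HUGE, &list, and the variable names are taken from the quoted code.

```sas
data subset;
  set huge(keep=date id var1-var5);
  where "01Jan1970"d <= date <= "31DEC2003"d;
  by id;                   /* DATE dropped: FIRST.DATE is never tested       */
  if id in(&list);         /* assumes PERMNO was a slip for ID               */
  year = year(date);
  var1_l = lag(var1);      /* LAG runs for every retained observation        */
  var2_l = lag(var2);
  if first.id then do;     /* discard values carried over from the prior ID  */
    var1_l = .;
    var2_l = .;
  end;
run;
```

Because the subsetting IF tests the BY variable itself, an ID is either kept or dropped in its entirety, so the observation that carries FIRST.ID is never deleted ahead of the reset. If the IF tested some other variable, that observation could be filtered out and the reset would silently never happen, which is the interaction point 1 warns about.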

On Mon, 27 Jun 2005 12:24:34 -0600, Michael Murff <mjm33@MSM1.BYU.EDU> wrote:

>Hi SAS-L,
>
>I'm accessing a very large dataset (6 gigs) with the following code:
>
>data subset;
>  set huge(keep=date id var1-var5);
>  where "01Jan1970"d <= date <= "31DEC2003"d;
>  by id date;
>  year=year(date);
>  if permno in(&list);
>  if first.date then
>  do;
>    var1_l = lag(var1);
>    var2_l = lag(var2);
>  end;
>  if first.id then
>  do;
>    var1_l = .;
>    var2_l = .;
>  end;
>run;
>
>&list contains a list of 2000 ids (sorted) that I care about. Each id will
>have a daily entry between the given dates. Huge dataset is already sorted
>by ID and DATE. I need a more efficient way to run this datastep as it takes
>several hours on our server. I have access to 8.2 and 9.1.3 SAS versions in
>Unix environments.
>
>I tried putting &list in a compound where statement but I reach the 8.2
>where byte limit discussed recently on the -l (haven't tried this on 9.1.3
>yet). Does the by statement slow this down? And what about the subsetting if
>statement. The final dataset "subset" should a few hundred MBs. I can write
>a gig with our SCSI drives in about 15 minutes? so it seems like this little
>dstep could be written to go faster.
>
>Thanks,
>
>Michael Murff
>Provo, UT

