Date: Tue, 28 Jun 2005 14:37:34 -0400
Reply-To: "Dorfman, Paul" <Paul.Dorfman@FCSO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Dorfman, Paul" <Paul.Dorfman@FCSO.COM>
Subject: Re: Optimization question
Content-Type: text/plain; charset=iso-8859-1
Michael,
Methinks the thread has go
On Mon, 27 Jun 2005 12:24:34 -0600, Michael Murff <mjm33@MSM1.BYU.EDU> wrote:
>Hi SAS-L,
>I'm accessing a very large dataset (6 gigs) with the following code:
>
>data subset;
> set huge(keep=date id var1-var5);
> where "01Jan1970"d <= date <= "31DEC2003"d;
> by id date;
> year=year(date);
> if permno in(&list);
> if first.date then
> do;
> var1_l = lag(var1);
> var2_l = lag(var2);
> end;
> if first.id then
> do;
> var1_l = .;
> var2_l = .;
> end;
>run;
>
>&list contains a list of 2000 ids (sorted) that I care about. Each id will
>have a daily entry between the given dates. Huge dataset is already sorted
>by ID and DATE. I need a more efficient way to run this datastep as it takes
>several hours on our server. I have access to 8.2 and 9.1.3 SAS versions in
>Unix environments.
>
>I tried putting &list in a compound where statement but I reach the 8.2
>where byte limit discussed recently on the -l (haven't tried this on 9.1.3
>yet). Does the by statement slow this down? And what about the subsetting if
>statement? The final dataset "subset" should be a few hundred MBs. I can write
>a gig with our SCSI drives in about 15 minutes, so it seems like this little
>data step could be written to go faster.
>
>Thanks,
>
>Michael Murff
>
>Provo, UT