Date: Tue, 19 Feb 2008 08:29:38 -0800
Reply-To: Peter <crawfordsoftware@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter <crawfordsoftware@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: Why does retain work faster conditionally?
Content-Type: text/plain; charset=ISO-8859-1
On Feb 19, 2:07 pm, EdHea...@WESTAT.COM (Ed Heaton) wrote:
> Art;
>
> I find it totally incomprehensible that the DATA step with the RETAIN statement in an IF block runs faster than the step without the IF block. After all, the RETAIN statement is a compile-time statement - not run-time.
>
> Here are my results on a much more modest machine (SAS 9.1.3 on a dual Pentium 3.40 GHz and 1 GB of RAM).
>
>            Conditional RETAIN     Unconditional RETAIN
>              Real      CPU          Real      CPU
>             13.50     1.18         11.75     1.18
>             13.95     0.99         14.12     1.42
>             16.89     0.84         15.67     1.56
>             13.29     1.17         15.23     1.39
>             13.48     0.95
>             15.82     1.35
>             -----     ----         -----     ----
> medians     13.73     1.08         14.68     1.40
>
> As you can see, these are all over the place. We really can't discern a difference.
>
> I suspect that the SAS compiler is smart enough to see that the IF block has no executable statements in it and thus ignores it at run time. I haven't tested this.
>
> %macro test() ;
>    %local ranUniOut ;
>    %let ranUniOut = %sysFunc( ranUni(1687462) ) ;
>    %put %nrStr(&ranUniOut)=&ranUniOut ;
>    %local roundOut ;
>    %let roundOut = %sysFunc( round(&ranUniOut) ) ;
>    %put %nrStr(&roundOut)=&roundOut ;
>    %if &roundOut
>    %then %do ;
>       %put Conditional RETAIN ;
>       data want;
>          if _n_ eq 1 then do;
>             retain fname;
>          end;
>          set have;
>       run;
>       Proc sql ; Drop table want ; Quit ;
>    %end ;
>    %else %do ;
>       %put Unconditional RETAIN ;
>       data want;
>          retain fname;
>          set have;
>       run;
>       Proc sql ; Drop table want ; Quit ;
>    %end ;
> %mEnd test ;
> %macro runTests(times) ;
>    %do i=1 %to &times ;
>       %test()
>    %end ;
> %mEnd runTests ;
> %runTests(10)
>
> Ed
>
> Edward Heaton, Senior Systems Analyst,
> Westat (An Employee-Owned Research Corporation),
> 1650 Research Boulevard, TB-286, Rockville, MD 20850-3195
> Voice: (301) 610-4818 Fax: (301) 294-2085
> mailto:EdHea...@Westat.com http://www.Westat.com
>
>
>
> -----Original Message-----
> From: owner-sa...@listserv.uga.edu [mailto:owner-sa...@listserv.uga.edu] On Behalf Of Arthur Tabachneck
> Sent: Tuesday, February 19, 2008 8:13 AM
> To: SA...@LISTSERV.UGA.EDU; Arthur Tabachneck
> Subject: Why does retain work faster conditionally?
>
> One of our most respected list members wrote me off-line, asking why in the world I would have suggested wrapping a retain statement within a condition.
>
> That is, given the following data:
>
> data have;
> input lname$ fname$;
> do i=1 to 1000000;output;end;
> cards;
> lname1 fname1
> lname2 fname2
> ;
>
> why write:
>
> data want;
> if _n_ eq 1 then do;
> retain fname;
> end;
> set have;
> run;
>
> instead of:
> data want;
> retain fname;
> set have;
> run;
>
> I know why I provided the solution, because it had better performance, but I could sure use some feedback explaining why that would be so.
>
> I initially wrote it correctly and, upon seeing that it worked slower than Jiann's SQL solution, tried to see if I could bypass reading the data (i.e., when _n_ eq 0).
>
> Once I realized that wouldn't be possible, I ran the step as presented.
>
> Someone please explain to me why:
>
> 60 data want;
> 61 if _n_ eq 1 then do;
> 62 retain fname;
> 63 end;
> 64 set a;
> 65 run;
>
> NOTE: There were 2000000 observations read from the data set WORK.A.
> NOTE: The data set WORK.WANT has 2000000 observations and 3 variables.
> NOTE: DATA statement used (Total process time):
> real time 1.12 seconds
> cpu time 1.12 seconds
>
> runs almost 50% faster than:
> 56 data want;
> 57 retain fname;
> 58 set a;
> 59 run;
>
> NOTE: There were 2000000 observations read from the data set WORK.A.
> NOTE: The data set WORK.WANT has 2000000 observations and 3 variables.
> NOTE: DATA statement used (Total process time):
> real time 1.43 seconds
> cpu time 1.43 seconds
>
> I ran the tests on a 4-processor Windows 2003 system with 12 GB of RAM and SAS 9.1.3. It was during a holiday, so I was the only one using the computer, and I re-ran the tests 3 times with the same results.
>
> Art
> --------
> On Mon, 18 Feb 2008 23:21:23 -0500, Arthur Tabachneck <art...@NETSCAPE.NET> wrote:
>
> >Miguel,
>
> >As Jiann indicated, you can do what you want with proc sql. However,
> >you can also accomplish the same thing in a data step. For example,
>
> >data have;
> > input lname$ fname$;
> > do i=1 to 1000000;output;end;
> > cards;
> > lname1 fname1
> > lname2 fname2
> > ;
>
> >data want;
> > if _n_ eq 1 then do;
> > retain fname;
> > end;
> > set have;
> >run;
>
> >HTH,
> >Art
> >---------
> >On Tue, 19 Feb 2008 02:55:04 +0000, Miguel de la Hoz
> ><miguel_...@YAHOO.ES>
> >wrote:
>
> >>My problem starts with the following layout of my dataset:
>
> ># variable
> >1 lname
> >2 fname
>
> >I am trying to export it to Excel, but it keeps that order. I would
> >like to be able to write
>
> ># variable
> >1 fname
> >2 lname
>
> >This is only an example; my dataset contains around 20 fields.
>
> >Thanks.
>
> >MDH.
>
I think the issue may be platform dependent.
On z/OS (SAS 9.1.3 SP4), using Art's code:
* why write: ;
data want;
   if _n_ eq 1 then do;
      retain fname;
   end;
   set have;
run;
* instead of: ;
data want;
   retain fname;
   set have;
run;
I ran each step 3 times, with these results
(showing only CPU seconds)
always retain     on _n_ = 1 only
     1.09              1.10
     1.08              1.12
     1.07              1.10
On z/OS it seems the difference is hardly measurable.
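Since RETAIN is a declarative statement, one way to confirm that the IF block changes nothing about the result is to build both versions side by side and compare them. The following is only a sketch of that check, not something taken from the runs above: the data set names want_cond and want_uncond are made up for illustration, and it assumes WORK.HAVE already exists as created by Art's DATA step.

```sas
/* Sketch only: want_cond and want_uncond are hypothetical names. */
/* Assumes WORK.HAVE exists as built by Art's DATA step.          */

data want_cond;
   if _n_ eq 1 then do;
      retain fname;      /* declarative: handled at compile time */
   end;
   set have;
run;

data want_uncond;
   retain fname;
   set have;
run;

/* VARNUM lists the variables by their position in each data set */
proc contents data=want_cond varnum;
run;
proc contents data=want_uncond varnum;
run;

/* Compares variable attributes and values of the two data sets */
proc compare base=want_cond compare=want_uncond;
run;
```

If the RETAIN is processed only at compile time, as Ed argues, both PROC CONTENTS listings should show fname ahead of lname and PROC COMPARE should report no differences. That would leave run-to-run noise as the more likely source of any timing gap.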
PeterC