LISTSERV at the University of Georgia
Date:         Thu, 6 Dec 2007 15:54:07 -0600
Reply-To:     "data _null_," <datanull@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "data _null_," <datanull@GMAIL.COM>
Subject:      Re: IF vs WHERE discussion
Comments: To: bruce johnson <chimanbj@gmail.com>
In-Reply-To:  <f3ed116f0712061350t79b74d9bk24715c818817477c@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

But we need a program that we can pass around to generate a BIGFILE, so that we can all use the same data.
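A minimal sketch of what such a shareable generator might look like, extending the two-variable WORK.BIGFILE step quoted below. The filler variables (pad1-pad10, txt) are assumptions added only to widen each row; they are not from the thread and do not reproduce the 76 variables in SASLIB.BIGFILE.

```sas
/* Sketch only: a generator anyone can run to get comparable test data.  */
/* The SEX and AGE assignments are copied from the quoted message; the   */
/* pad1-pad10 and txt variables are hypothetical filler.                 */
options fullstimer msglevel=i;

data work.bigfile;
   array pad{10} pad1-pad10;   /* hypothetical numeric filler */
   length txt $40;             /* hypothetical character filler */
   do _n_ = 1 to 6331860;
      sex = rantbl(12345,1/6,1/6,1/6,1/6, 1/6,1/6,1/6);
      age = floor(abs(rannor(12345)*20) + 30);
      do i = 1 to dim(pad);
         pad{i} = ranuni(12345);
      end;
      txt = repeat('x', 39);   /* REPEAT(s, n) returns n+1 copies: 40 characters */
      output;
   end;
   drop i;
run;
```

Because the seeds are fixed, everyone who runs this step gets the same values, so the IF and WHERE timings can then be compared against identical data on each machine.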

On Dec 6, 2007 3:50 PM, bruce johnson <chimanbj@gmail.com> wrote:
> The WORK library is local. And this file is a datafile that I use for
> testing many different scenarios. It's filled with all types of
> fields, not just random number fields. This is probably the reason why
> you see such a difference in the times.
>
> On Dec 6, 2007 3:47 PM, data _null_, <datanull@gmail.com> wrote:
> > For example, creating a WORK.BIGFILE on my PC where WORK library is not
> > "out on the network" is much faster. Real time very close to CPU. I
> > realize my BIGFILE is not like yours. You have many more variables.
> >
> > 388  option fullstimer msglevel=i;
> > 389  data work.bigfile;
> > 390     do _n_ = 1 to 6331860;
> > 391        sex = rantbl(12345,1/6,1/6,1/6,1/6, 1/6,1/6,1/6);
> > 392        age = floor(abs(rannor(12345)*20) + 30);
> > 393        output;
> > 394        end;
> > 395     run;
> >
> > NOTE: The data set WORK.BIGFILE has 6331860 observations and 2 variables.
> > NOTE: DATA statement used (Total process time):
> >       real time           5.71 seconds
> >       user cpu time       4.96 seconds
> >       system cpu time     0.71 seconds
> >       Memory              152k
> >
> > On Dec 6, 2007 3:26 PM, bruce johnson <chimanbj@gmail.com> wrote:
> > > Whenever you're writing data to disk, the real and CPU time will
> > > differ because of the disk I/O.
> > >
> > > But since you requested it, here it is (putting the where clause in
> > > the SET statement is the clear winner):
> > >
> > > 312  options fullstimer;
> > > 313  data test;
> > > 314  set saslib.bigfile(where=(sex=6 and age<50));
> > > 315  run;
> > >
> > > NOTE: There were 1812750 observations read from the data set SASLIB.BIGFILE.
> > >       WHERE (sex=6) and (age<50);
> > > NOTE: The data set WORK.TEST has 1812750 observations and 76 variables.
> > > NOTE: DATA statement used (Total process time):
> > >       real time           3:39.14
> > >       user cpu time       6.66 seconds
> > >       system cpu time     10.01 seconds
> > >       Memory              223k
> > >
> > > 316  data test;
> > > 317  set saslib.bigfile;
> > > 318  where sex=6 and age<50;
> > > 319  run;
> > >
> > > NOTE: There were 1812750 observations read from the data set SASLIB.BIGFILE.
> > >       WHERE (sex=6) and (age<50);
> > > NOTE: The data set WORK.TEST has 1812750 observations and 76 variables.
> > > NOTE: DATA statement used (Total process time):
> > >       real time           4:04.42
> > >       user cpu time       7.52 seconds
> > >       system cpu time     10.40 seconds
> > >       Memory              222k
> > >
> > > 320  data test;
> > > 321  set saslib.bigfile;
> > > 322  if sex=6 and age<50;
> > > 323  run;
> > >
> > > NOTE: There were 6331860 observations read from the data set SASLIB.BIGFILE.
> > > NOTE: The data set WORK.TEST has 1812750 observations and 76 variables.
> > > NOTE: DATA statement used (Total process time):
> > >       real time           3:48.71
> > >       user cpu time       7.70 seconds
> > >       system cpu time     10.43 seconds
> > >       Memory              212k
> > >
> > > On Dec 6, 2007 1:48 PM, data _null_, <datanull@gmail.com> wrote:
> > > > I'm not sure your sample is large enough. Can you post the code that
> > > > makes BIGFILE and make it bigger? The OP mentioned 1-5M but you only
> > > > have 0.5M.
> > > >
> > > > Are you sharing your computer? It might be better to test when real
> > > > and CPU time are closer, when your computer has less contention for
> > > > resources.
> > > >
> > > > On Dec 6, 2007 1:10 PM, bruce johnson <chimanbj@gmail.com> wrote:
> > > > > Chew on this...
> > > > > 10   options fullstimer;
> > > > > 11   data test;
> > > > > 12   set saslib.bigfile;
> > > > > 13   if sex=6 and age<50;
> > > > > 14   run;
> > > > >
> > > > > NOTE: There were 422124 observations read from the data set SASLIB.BIGFILE.
> > > > > NOTE: The data set WORK.TEST has 120850 observations and 76 variables.
> > > > > NOTE: DATA statement used (Total process time):
> > > > >       real time           18.11 seconds
> > > > >       user cpu time       0.37 seconds
> > > > >       system cpu time     0.81 seconds
> > > > >       Memory              212k
> > > > >
> > > > > 15   data test;
> > > > > 16   set saslib.bigfile;
> > > > > 17   where sex=6 and age<50;
> > > > > 18   run;
> > > > >
> > > > > NOTE: There were 120850 observations read from the data set SASLIB.BIGFILE.
> > > > >       WHERE (sex=6) and (age<50);
> > > > > NOTE: The data set WORK.TEST has 120850 observations and 76 variables.
> > > > > NOTE: DATA statement used (Total process time):
> > > > >       real time           23.56 seconds
> > > > >       user cpu time       0.59 seconds
> > > > >       system cpu time     1.15 seconds
> > > > >       Memory              222k
> > > > >
> > > > > On Dec 6, 2007 12:46 PM, data _null_, <datanull@gmail.com> wrote:
> > > > > > TEST, don't speculate. OPTIONS FULLSTIMER;
> > > > > >
> > > > > > On Dec 6, 2007 12:41 PM, LWn <Lars.WahlgrenRemove@this.stat.lu.se> wrote:
> > > > > > > Is there any difference between IF and WHERE when used in a data step?
> > > > > > > Data set zero below has between 1 and 5 million records.
> > > > > > >
> > > > > > > data one ;
> > > > > > > set zero ;
> > > > > > > WHERE <condition1 AND condition2> ;
> > > > > > > or
> > > > > > > IF <condition1 AND condition2> ;
> > > > > > > run ;
> > > > > > >
> > > > > > > I say there is NO difference regarding efficiency but a friend says WHERE is
> > > > > > > more efficient.
> > > > > > > I've made some simulations that support my opinion.
> > > > > > >
> > > > > > > Do you all support my opinion or am I missing something?
> > > > > > >
> > > > > > > /LarsW

