(And you should use quit not run to terminate proc datasets).
So which is faster?
On Wed, 25 Feb 2004 20:27:20 -0500, Howard Schreier
>You have unintentionally stacked the deck.
>PROC SQL exploits the index and uses a direct access strategy. It only
>reads the rows it needs from A.
>The DATA step uses the index to retrieve the rows in order, but it still
>reads all 5 million of them.
>On Wed, 25 Feb 2004 18:57:21 -0300, rdporto1 <rdporto1@TERRA.COM.BR> wrote:
>>Ok. That's the code and some log.
>>Sincerely, I don't know why this is happening...
>>data b (compress=yes reuse=yes);
>>do b1=1 to 5000000;
>>if b1 in (10 20 30) then x=4;
>>proc datasets library=work;
>>index create x;
>>do a1=1 to 500;
>>proc sort data=a;
>>create table d as
>>from a as a, b as b
>>merge b a (in=k);
>>proc compare data=c compare=d;
>>359 proc sql;
>>360 create table d as
>>361 select *
>>362 from a as a, b as b
>>363 where a.x=b.x;
>>WARNING: Variable x already exists on file WORK.D.
>>NOTE: Table WORK.D created, with 503 rows and 10 columns.
>>NOTE: PROCEDURE SQL used:
>> real time 0.01 seconds
>> cpu time 0.01 seconds
>>365 data c;
>>366 merge b a (in=k);
>>367 by x;
>>368 if k;
>>NOTE: There were 5000000 observations read from the data set WORK.B.
>>NOTE: There were 500 observations read from the data set WORK.A.
>>NOTE: The data set WORK.C has 503 observations and 10 variables.
>>NOTE: DATA statement used:
>> real time 9.71 seconds
>> cpu time 9.65 seconds