Hi

I just tried out this comparison of three methods for a relatively large dataset - about 5 Million numeric observations :


*CREATE TEST DATA*;
        DATA TEST(KEEP=I);
        DO I=1 TO 1000000;
        DO J=1 TO 10;
        IF RANUNI(0)>0.5 THEN OUTPUT;
        END;
        END;
        RUN;

*METHOD 1 : PROC SORT*;
        PROC SORT DATA=TEST OUT=TEST2 NODUPLICATES; BY I; RUN;

*METHOD 2 : PROC SQL*;
        PROC SQL;
        CREATE TABLE TEST3 AS
        SELECT DISTINCT I
        FROM TEST;
        QUIT;

*METHOD 3 : DATA STEP ARRAY*;
        DATA TEST4(KEEP=I);
        ARRAY TEMP{1000000} _TEMPORARY_;
        IF _N_=1 THEN DO;
                DO J=1 TO MAXJ;
                SET TEST NOBS=MAXJ POINT=J;
                TEMP{I}=SUM(TEMP{I},1);
                END;
        END;
        DO I=1 TO 1000000;
        IF TEMP{I}>0 THEN OUTPUT;
        END;
        STOP;
        RUN;

*FINALLY : CHECK RESULTS ARE ALL SAME*;
        DATA LAST;
        SET TEST2(RENAME=(I=I2));
        SET TEST3(RENAME=(I=I3));
        SET TEST4(RENAME=(I=I4));
        IF I2 NE I3;
        IF I2 NE I4;
        RUN;



From the log, the timings were as follows :

*METHOD 1 : PROC SORT*;
      real time           1:29.50
      cpu time            24.65 seconds

*METHOD 2 : PROC SQL*;
      real time           51.12 seconds
      cpu time            28.26 seconds

*METHOD 3 : DATA STEP ARRAY*;
      real time           36.26 seconds
      cpu time            6.57 seconds


So, method 3 was quickest for a large dataset - it's slowest for small datasets.

I ran this a handful of times - results fluctuated because usage of our server fluctuated - but got similar time savings with method 3.



Neale Parsons
Business Analyst
e-mail : Parsonsn@BUPA.com
Tel : 44 (0) 1784 891 265
Fax : 44 (0) 1784 891 281
BUPA, Commercial Analysis Team, Thameside House,
South Street, Staines, Middx, TW18 4TL



......................................................................
BUPA
the personal health service
.....................................................................

BUPA House, 15-19 Bloomsbury Way, London, WC1A 2BA

Visit: http://www.bupa.com
----------------------------------------------------------------------
Internet communications are not secure and therefore BUPA does
not accept legal responsibility for the contents of this message. Any
views or opinions presented are solely those of the author and do
not necessarily represent those of BUPA.
-----------------------------------------------------------------------