LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (October 2001, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 9 Oct 2001 13:34:08 +0100
Reply-To:   "PARSONS, Neale" <PARSONSN@BUPA.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "PARSONS, Neale" <PARSONSN@BUPA.COM>
Subject:   Re: grabbing the unique variable from a data set
Content-Type:   multipart/alternative ;

Hi

I just tried out this comparison of three methods for a relatively large dataset - about 5 Million numeric observations :

*CREATE TEST DATA*; DATA TEST(KEEP=I); DO I=1 TO 1000000; DO J=1 TO 10; IF RANUNI(0)>0.5 THEN OUTPUT; END; END; RUN;

*METHOD 1 : PROC SORT*; PROC SORT DATA=TEST OUT=TEST2 NODUPLICATES; BY I; RUN;

*METHOD 2 : PROC SQL*; PROC SQL; CREATE TABLE TEST3 AS SELECT DISTINCT I FROM TEST; QUIT;

*METHOD 3 : DATA STEP ARRAY*; DATA TEST4(KEEP=I); ARRAY TEMP{1000000} _TEMPORARY_; IF _N_=1 THEN DO; DO J=1 TO MAXJ; SET TEST NOBS=MAXJ POINT=J; TEMP{I}=SUM(TEMP{I},1); END; END; DO I=1 TO 1000000; IF TEMP{I}>0 THEN OUTPUT; END; STOP; RUN;

*FINALLY : CHECK RESULTS ARE ALL SAME*; DATA LAST; SET TEST2(RENAME=(I=I2)); SET TEST3(RENAME=(I=I3)); SET TEST4(RENAME=(I=I4)); IF I2 NE I3; IF I2 NE I4; RUN;

From the log, the timings were as follows :

*METHOD 1 : PROC SORT*; real time 1:29.50 cpu time 24.65 seconds

*METHOD 2 : PROC SQL*; real time 51.12 seconds cpu time 28.26 seconds

*METHOD 3 : DATA STEP ARRAY*; real time 36.26 seconds cpu time 6.57 seconds

So, method 3 was quickest for a large dataset - it's slowest for small datasets.

I ran this a handful of times - results fluctuated because usage of our server fluctuated - but got similar time savings with method 3.

Neale Parsons Business Analyst e-mail : Parsonsn@BUPA.com Tel : 44 (0) 1784 891 265 Fax : 44 (0) 1784 891 281 BUPA, Commercial Analysis Team, Thameside House, South Street, Staines, Middx, TW18 4TL

...................................................................... BUPA the personal health service .....................................................................

BUPA House, 15-19 Bloomsbury Way, London, WC1A 2BA

Visit: http://www.bupa.com ---------------------------------------------------------------------- Internet communications are not secure and therefore BUPA does not accept legal responsibility for the contents of this message. Any views or opinions presented are solely those of the author and do not necessarily represent those of BUPA. -----------------------------------------------------------------------


[text/html]


Back to: Top of message | Previous page | Main SAS-L page