| Date: | Tue, 9 Oct 2001 13:34:08 +0100 |
| Reply-To: | "PARSONS, Neale" <PARSONSN@BUPA.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "PARSONS, Neale" <PARSONSN@BUPA.COM> |
| Subject: | Re: grabbing the unique variable from a data set |
| Content-Type: | multipart/alternative ;
|
|---|
Hi
I just tried out this comparison of three methods for a relatively large
dataset - about 5 Million numeric observations :
*CREATE TEST DATA*;
DATA TEST(KEEP=I);
DO I=1 TO 1000000;
DO J=1 TO 10;
IF RANUNI(0)>0.5 THEN OUTPUT;
END;
END;
RUN;
*METHOD 1 : PROC SORT*;
PROC SORT DATA=TEST OUT=TEST2 NODUPLICATES; BY I; RUN;
*METHOD 2 : PROC SQL*;
PROC SQL;
CREATE TABLE TEST3 AS
SELECT DISTINCT I
FROM TEST;
QUIT;
*METHOD 3 : DATA STEP ARRAY*;
DATA TEST4(KEEP=I);
ARRAY TEMP{1000000} _TEMPORARY_;
IF _N_=1 THEN DO;
DO J=1 TO MAXJ;
SET TEST NOBS=MAXJ POINT=J;
TEMP{I}=SUM(TEMP{I},1);
END;
END;
DO I=1 TO 1000000;
IF TEMP{I}>0 THEN OUTPUT;
END;
STOP;
RUN;
*FINALLY : CHECK RESULTS ARE ALL SAME*;
DATA LAST;
SET TEST2(RENAME=(I=I2));
SET TEST3(RENAME=(I=I3));
SET TEST4(RENAME=(I=I4));
IF I2 NE I3;
IF I2 NE I4;
RUN;
From the log, the timings were as follows :
*METHOD 1 : PROC SORT*;
real time 1:29.50
cpu time 24.65 seconds
*METHOD 2 : PROC SQL*;
real time 51.12 seconds
cpu time 28.26 seconds
*METHOD 3 : DATA STEP ARRAY*;
real time 36.26 seconds
cpu time 6.57 seconds
So, method 3 was quickest for a large dataset - it's slowest for small
datasets.
I ran this a handful of times - results fluctuated because usage of our
server fluctuated - but got similar time savings with method 3.
Neale Parsons
Business Analyst
e-mail : Parsonsn@BUPA.com
Tel : 44 (0) 1784 891 265
Fax : 44 (0) 1784 891 281
BUPA, Commercial Analysis Team, Thameside House,
South Street, Staines, Middx, TW18 4TL
......................................................................
BUPA
the personal health service
.....................................................................
BUPA House, 15-19 Bloomsbury Way, London, WC1A 2BA
Visit: http://www.bupa.com
----------------------------------------------------------------------
Internet communications are not secure and therefore BUPA does
not accept legal responsibility for the contents of this message. Any
views or opinions presented are solely those of the author and do
not necessarily represent those of BUPA.
-----------------------------------------------------------------------
[text/html]
|