Date: Wed, 25 Sep 1996 16:26:45 +0100
Reply-To: John Whittington <johnw@MAG-NET.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: John Whittington <johnw@MAG-NET.CO.UK>
Subject: Re: SQL vs. DATA step
On Wed, 25 Sep 1996, Bruce Rogers <gxx18300@GGR.CO.UK> wrote (in part):
>On the SQL vs DATA debate, I re-ran your examples (1 million recs, 4
>vars) with some interesting results, showing the SQL code as MORE
>efficient, not less. The use of a WHERE option on the SET statement
>brings the two methods even. These tests were run on Windows 3.1, V6.11
>wave 2 (TS040), so I don't know if there's been a sudden improvement in
>SQL, or what!
Bruce, if there's been a sudden improvement, it must have been between 6.11
TS020 and TS040 (which I very much doubt) - since my tests (which showed at
least a 2:1 advantage in favour of the DATA step) were run with 6.11 TS020
(the TS040 CD-ROM being on my desk awaiting installation).
Another point is that the traditional view seems to be that SQL starts
losing against DATA step code more dramatically when the datasets get
'large'; the differences I am seeing are fairly consistent across all sizes
of dataset I have examined.
Since I/O often seems to account for a large proportion of the total
execution time of simple DATA steps, I wonder if SQL is perhaps
particularly 'bad' at I/O. If it is not, and both methods spend about the
same time on I/O, then the whole of the difference must lie in computation,
and the actual computation time (excluding I/O) for SQL would, with my
figures, be dramatically worse than that for the DATA step.
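For anyone without the earlier messages to hand, the kind of comparison
under discussion might be sketched roughly as below. This is only an
illustration, not the code either of us actually ran: the dataset names
(WORK.TEST, WORK.SUB_IF, WORK.SUB_WHERE, WORK.SUB_SQL), the variable names
and the subsetting condition are all assumptions, chosen simply to give
1 million records with 4 variables.

```sas
/* Hypothetical test data: 1 million records, 4 variables.        */
/* Names and values are assumptions made for illustration only.   */
data work.test;
  do id = 1 to 1000000;
    x = ranuni(12345);
    y = ranuni(12345);
    z = ranuni(12345);
    output;
  end;
run;

/* DATA step with a subsetting IF: every record is read into the  */
/* program data vector and only then tested.                      */
data work.sub_if;
  set work.test;
  if x > 0.5;
run;

/* DATA step with the WHERE= data set option on SET: records are  */
/* filtered before they reach the program data vector.            */
data work.sub_where;
  set work.test(where=(x > 0.5));
run;

/* PROC SQL selecting the same records.                           */
proc sql;
  create table work.sub_sql as
  select *
  from work.test
  where x > 0.5;
quit;
```

Assuming timing notes are being written to the log, the times reported
after each of the last three steps can then be compared directly, ideally
over several runs and several dataset sizes.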
Regards
John
-----------------------------------------------------------
Dr John Whittington, Voice: +44 1296 730225
Mediscience Services Fax: +44 1296 738893
Twyford Manor, Twyford, E-mail: johnw@mag-net.co.uk
Buckingham MK18 4EL, UK CompuServe: 100517,3677
-----------------------------------------------------------