LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (February 2004, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 3 Feb 2004 14:42:18 -0600
Reply-To:     Toby Dunn <tdunn@OAKHILLTECH.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Toby Dunn <tdunn@OAKHILLTECH.COM>
Subject:      Re: SQL speed
In-Reply-To:  <200402032032.i13KWaB30075@listserv.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"

I remember my mainframe admins saying that on our system it takes sorting a certain number of obs before our system would use syncsort, otherwise it would just use sas's sort. Wonder if the same is true weather or not im using a datastep or a sql statement. Hmmmmmm....yet another thind on a long list to ponder..

Toby Dunn

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Don Stanley Sent: Tuesday, February 03, 2004 2:33 PM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: SQL speed

Yes, as i recall when proc syncsort is installed any proc sort automagically becomes a proc syncsort. This is very sensible, I wouldn't want to be coding "proc syncsort" in my jobs and then have the company dictate we move to a different sort engine. I'd have a massive overhead in maintenance. I had neglected to mention this, and it does mean effectively only two choices for Proc SORT, SAS sort or proc syncsort. for SQL it also means two choices, SAS SORT or system syncsort.

I always understood proc syncsort to be a proc sort replacement, no intention on the part of the developers to use it with SAS SQL (which wasn't even around when I first used proc syncsort!). I wasn't clear about that in my earlier post.

Crux of my message doesn't change. PROC SYNCSORT is very worthwhile. There are situations where you cannot avoid sorting in SQL, otherwise with proc syncsort installed it just gets used with proc sort.

btw, in my earlier I referred to "disk speed". I meant "disk space".

Don

On Tue, 3 Feb 2004 10:06:04 -0600, pudding man <pudding_man@MAIL.COM> wrote:

>Don, > >Given that, in Alex SAS log: > >1.) The PROC SORT code yields the explicit message: > "NOTE: WER750I End PROC SYNCSORT. R2.2D+" > and the PROC SQL code apparently does not. > >2.) The PROC SORT code uses about 15% less cpu cycles. > >Isn't it possible that PROC SORT is automagically >linking to the PROC SYNCSORT engine and PROC SQL is >not (is rather using maybe the standard interface to the >standard SYNCSORT utility)? > >If this is the case, it could be due to >the engineering of the PROC SYNCSORT engine or >the engineering of the PROC SQL sort interface >or both? > >There are SAS system options that could likely shed >light on this issue ... SORTMSG and SORTMSG=ddname >come to mind ... > >I have long been curious about this but have no >suitable system on which to test. > > Zalut, > Puddin' > >******************************************************* >***** Puddin' Man **** Pudding_Man-at-mail.com ******** >*******************************************************; > >----- Original Message ----- >From: Don Stanley <don_stanley@PARADISE.NET.NZ> >Date: Mon, 2 Feb 2004 22:33:54 -0500 >To: SAS-L@LISTSERV.UGA.EDU >Subject: Re: SQL speed > >> Using PROC Syncsort, as opposed to Syncsort, should generally see sort >> times decrease. However, this is really not noticeable with smallish >> datasets like you have posted with. When I evaluated PROC SYNCSORT for the >> mainframe 2-3 years ago, I repeated success stories made earlier -- PROC >> SYNCSORT could at least halve sort times for very large datasets. >> >> When I asked for an explantion of how this happened, Syncsort gave a very >> simple and plausible answer which I assume to be correct. They said when >> SAS uses a system sort, it must first get data into a format suitable for >> that system sort. So there is an overhead on SAS to prepare data, and at >> the other end to get data back. But using PROC SYNCSORT (note -- proc >> syncsort, not the system syncsort product) they removed this step by >> writing an engine that directly accesses the SAS dataset. In fact, quoting >> from the Syncsort website seems to confirm this .. "PROC SYNCSORT achieves >> such gains by providing a direct interface between SAS and SyncSort. This >> frees SyncSort to use the high-performance techniques-sophisticated access >> methods, path length minimization, I/O optimization-that have made SyncSort >> the most frequently used sort product." >> >> There are effectively 3 sorts available to you (I'm referring to the poster >> here -- not the generic SAS user "you"). Using a system Syncsort, using SAS >> internal sort, and using PROC SYNCSORT. There is an internal option to >> control whether SAS uses its own sort or system sort. I'm not sure that the >> question of using SQL, as opoosed to say, PROC SORT, is really relevant, >> but I think you're getting at the difference between the "one step" proc >> sql approach, and the dual step sql and proc sort/syncsort approach. In >> this instance, I'd personally always consider the proc syncsort rather than >> automatically doing everything in sql. >> >> In my experience, the biggest gains were on datasets with lots of records >> and lots of variables, or where there were lots of records with long >> character variables. PROC SYNCSORT seemed to handle these with ease, whilst >> SAS sort seemed to perform best with small narrow datasets. I tended to try >> system sort and proc syncsort at first, but fairly soon I got to the point >> where I stuck with the proc routinely for large datasets (>500,000 records, >> or very wide records). >> >> Proc Syncsort also had a positive impact on disk speed. It used much less. >> >> If your SQL were a view ... and proc syncsort sorted the view output, you >> have no extra disk overhead, and it should run extremely quick ... and its >> sort of a one step thing (excuse the pun) ... >> >> Don >> >> >> On Mon, 2 Feb 2004 18:07:29 -0500, Alexander Cavallo <acavallo@LEXECON.COM> >> wrote: >> >> >Hi SAS-L, >> > >> >I am new to using SQL. From reading SAS-L, I thought that SQL sorts faster >> >than SYNCSORT. In the example below, SYNCSORT is faster. Could you >> >explain to me how big a dataset needs to be before SQL is faster? >> > >> >Thanks, >> > >> >--Alex Cavallo >> >Lexecon, Inc. >> > >> > >> >48 proc sql noprint; >> >49 create table temp.sql as ( >> >50 select distinct emp_id, date >> >51 from hrdata.all ) >> >52 order by emp_id, date >> >53 ; >> >NOTE: Table TEMP.SQL created, with 179291 rows and 2 columns. >> > >> >54 quit; >> >NOTE: The PROCEDURE SQL used 2.22 CPU seconds and 10730K. >> > >> >55 >> >56 proc sort nodupkey data=hrdata.all out=temp.test(keep=emp_id >> >date); >> >57 by emp_id date; >> >58 run; >> > >> >NOTE: WER750I End PROC SYNCSORT. R2.2D+ >> >NOTE: There were 179291 observations read from the data set HRDATA.ALL. >> >NOTE: The data set TEMP.TEST has 179291 observations and 2 variables. >> >NOTE: The PROCEDURE SORT used 1.88 CPU seconds and 10907K. >> > >-- >_______________________________________________ >Get your free email from http://www.mail.com

Yes, as i recall when proc syncsort is installed any proc sort automagically becomes a proc syncsort. This is very sensible, I wouldn't want to be coding "proc syncsort" in my jobs and then have the company dictate we move to a different sort engine. I'd have a massive overhead in maintenance. I had neglected to mention this, and it does mean effectively only two choices, SAS sort or syncsort.

I always understood proc syncsort to be a proc sort replacement, no intention on the part of the developers to use it with SAS SQL (which wasn't even around when I first used proc syncsort!). I wasn't about that in my earlier post.

Crux of my message doesn't change. PROC SYNCSORT is very worthwhile. There are situations where you cannot avoid sorting in SQL, otherwise with proc syncsort installed it just gets used.


Back to: Top of message | Previous page | Main SAS-L page