Date: Wed, 21 Apr 1999 11:59:42 -0400
Reply-To: "Paul M. Dorfman" <sashole@EARTHLINK.NET>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: "Paul M. Dorfman" <sashole@EARTHLINK.NET>
Organization: None
Subject: Re: How do I know if a dataset is allready sort when using PROC Sort
Content-Type: text/plain; charset=koi8-r
Lars,
Checking beforehand whether PROC SORT has already sorted the dataset will not
make it go any faster. Simply write the PROC SORT step. If the dataset has
already been sorted by the variables you specify, PROC SORT will issue a
message to that effect and bypass sorting altogether. Note that PROC SORT is
smart enough to do this even if the dataset is sorted by, for instance,
A B C D, and you are requesting A B C order.
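To see this for yourself, and to answer the question about a dataset
attribute you can test, here is a minimal sketch. The dataset WORK.DEMO and
its values are made up for illustration. It assumes a release that has
%SYSFUNC; the same OPEN/ATTRC/CLOSE functions should also be callable directly
from SCL, where ATTRC with the SORTEDBY attribute returns the BY list recorded
in the descriptor (blank if none is recorded).

```sas
/* Hypothetical dataset WORK.DEMO, for illustration only */
data work.demo;
   input a b c d;
   datalines;
2 1 1 1
1 2 2 2
1 1 3 3
;
run;

/* First sort does the work and records A B C D in the descriptor */
proc sort data=work.demo; by a b c d; run;

/* A B C is a leading subset of the recorded order, so the log should */
/* carry a note that no sorting was done                              */
proc sort data=work.demo; by a b c; run;

/* Read the recorded BY list back from the descriptor */
%let dsid     = %sysfunc(open(work.demo));
%let sortedby = %sysfunc(attrc(&dsid, SORTEDBY));
%let rc       = %sysfunc(close(&dsid));
%put Descriptor says sorted by: &sortedby;
```

In an SCL program you would test the returned string before the SUBMIT block
and skip generating the PROC SORT step when it already starts with your key.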
The situation is different, of course, if your dataset is physically ordered
by the variables in question but has never been explicitly sorted by PROC
SORT. For instance, it may have been created from an already ordered flat
file or another SAS dataset. In this case, no information about its being
sorted is stored in the descriptor, and PROC SORT has no way to find out
beforehand, so it will go ahead and sort. With a long satellite record and a
lot of records, that may be more expensive than an explicit DATA step check.
So, if you have good reason to suspect that dataset A is pre-ordered by
variable A, something along these lines (it is easy to modify if you have a
by-list; that would be a good macro exercise, too):
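%MACRO CONDSORT (DSN=A,SORTVAR=A);
   /* Assumes &SORTVAR is numeric; PREVA starts out missing, which */
   /* compares lower than any numeric value.                       */
   %LOCAL SORTED;
   %LET SORTED = 1;   /* an empty dataset counts as sorted */
   DATA _NULL_;
      SORTED = '1';
      DO UNTIL (END);
         SET &DSN (KEEP=&SORTVAR) END=END;
         IF &SORTVAR < PREVA THEN DO;
            SORTED = '0';   /* out of order: stop reading */
            LEAVE;
         END;
         ELSE PREVA = &SORTVAR;
      END;
      CALL SYMPUT('SORTED',SORTED);
      STOP;   /* do not start another iteration after an early LEAVE */
   RUN;
   %IF NOT &SORTED %THEN %DO;
      PROC SORT DATA=&DSN; BY &SORTVAR; RUN;
   %END;
%MEND CONDSORT;
%CONDSORT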
may be much less expensive, on average, than just letting PROC SORT go.
However, if your record is not too long, and especially if you have PROC
SYNCSORT installed, you may as well just let it run.
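If you create the pre-ordered dataset yourself, another route is to record
the order in the descriptor at creation time with the SORTEDBY= dataset
option, so that a later PROC SORT can see it. A minimal sketch follows; the
file name 'ordered.dat' and the variables are hypothetical. As far as I know,
SAS takes your word for the assertion, so make it only when you are sure of
the order. Whether PROC SORT trusts an asserted (rather than PROC SORT-made)
flag may depend on your release and options, so check the log on your system.

```sas
/* Hypothetical flat file, assumed to be in ascending order of A already */
data work.a (sortedby=a);
   infile 'ordered.dat';
   input a b c;
run;

/* With the order recorded in the descriptor, this step may be skipped; */
/* the log tells you whether a sort was actually performed              */
proc sort data=work.a; by a; run;
```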
Kind regards,
Paul M. Dorfman
Jacksonville, FL
Lars Cortsen wrote:
>
> Hi Group
>
> To make this go faster I would like to test if a dataset has already been
> sorted. This I will do before the submit continue statement. Is there a
> system or dataset variable I can test against?
>
> ************************************************
> rc = close(sfile);
> sfile = 0;
> if ex = 1 then return;
> if sysrc = -1 then
> return;
>
> submit continue;
> *** rsubmit;
> data a;
> set &sds; ----->&sds=large dataset(2 mill)
> where &where1 &where2;
> &algo1; --? Variable containing something like this : varname=1000
> run;
> proc sort data= &sds;
> by unique;----? key;
> proc sort data=a;
> by unique;
> run;
> data &sds;
> merge &sds a;
> by unique;
> run;
> * DM 'CLEAR LOG';
> *** endrsubmit;
> endsubmit;
> RETURN;
>
> Thanks
> Lars Cortsen