| Date: | Sun, 13 May 2001 17:33:56 GMT |
| Reply-To: | dkb@CIX.COMPULINK.CO.UK |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | dkb@CIX.COMPULINK.CO.UK |
| Organization: | CIX - Compulink Information eXchange |
| Subject: | Re: Large Datasets |
|---|
Patricia Ledesma suggests:
> A tip I did not see mentioned (if pressed for disk space) is to use PROC
> SQL instead of procedures that will require sorting. Below is an example
> of PROC SQL instead of a PROC SUMMARY that dealt with an unusually "long"
> file (269 million observations, 16 or 17 variables...
<snip>
> Instead of:
>
> proc sort data=n15 sortsize=max tagsort;
> by avgrank;
>
> proc summary data=n15 n;
> var avgrank;
> by avgrank;
> output out = statn15 n = numb;
<snip>
Why bother to sort when you don't need to? You mention using CLASS
variables with SUMMARY later in your posting, so why not use them here?
Then again, why use SUMMARY at all in this case? How about replacing
these steps with:
/* untested code - no licence on my home machine */
proc freq data=n15 noprint;
  /* MISSING is a TABLES statement option: it counts missing avgrank as a level */
  tables avgrank / missing out = statn15(rename=(count=numb) drop=percent);
run;
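For comparison, the CLASS-variable route mentioned above might look something like the sketch below. It is equally untested, and it assumes the automatic _FREQ_ count is an acceptable stand-in for the N statistic in the original step. Note that _FREQ_ counts every observation in a level, including the missing level, whereas N of avgrank counts only non-missing values.

```sas
/* untested sketch: CLASS replaces the PROC SORT + BY combination */
proc summary data=n15 nway missing;
  class avgrank;
  /* with no VAR statement the output holds only _TYPE_ and _FREQ_ */
  output out = statn15(rename=(_freq_=numb) drop=_type_);
run;
```

NWAY keeps only the rows for the individual avgrank levels and drops the overall total row. CLASS levels are held in memory, so whether this is practical on 269 million observations depends on how many distinct avgrank values there are.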
Does SQL still come out more efficient?
Kind regards,
Dave