| Date: | Fri, 28 Aug 1998 05:49:36 -0400 |
| Reply-To: | "Alexander J. Martinson" <alexmart@JUNIK.LV> |
| Sender: | "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU> |
| From: | "Alexander J. Martinson" <alexmart@JUNIK.LV> |
| Organization: | AJM, Inc. |
| Subject: | PROC SORT Alternative? |
| Content-Type: | text/plain; charset=koi8-r |
Dear SAS-L:
I was intrigued by some recent postings on sorting. It looks like, in specific
cases, a self-coded routine can run faster than PROC SORT. I am quite
interested because daily we receive hundreds of unsorted vendor files in
the form of SAS datasets. The files are then used as drivers in further
processing, so we sort them with NODUPKEY and eliminate out-of-range keys
in the WHERE clause. It takes lots of resources even though the datasets
contain 1 variable, ID, which is just a positive integer. In all, we have
about 350,000 accounts (this can probably grow to about 500,000 due to
acquisition). IDs from 1 to some MINID (1000 but may vary) are reserved, so
the WHERE includes only keys between MINID and whatever MAXID is at present
(350,000 now but may grow to 600,000 due to acquisition). Out of 200,000 to
300,000 observations, on average, 25-50% end up being deduped. If we could
cut CPU time by even 10-15%, coding a custom routine would be worth it. We run
SAS 6.09 on a mainframe.
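For concreteness, the current step looks roughly like the sketch below. The
dataset names VENDOR and DRIVER and the macro variables MINID and MAXID are
placeholders for illustration only, and the values are simply the present
ones quoted above.

```sas
/* Assumed values: the ID limits described above; both may change. */
%let minid = 1000;      /* IDs below this are reserved            */
%let maxid = 350000;    /* highest valid ID at present            */

/* VENDOR = one unsorted incoming file (placeholder name).        */
/* DRIVER = the sorted, deduped driver file (placeholder name).   */
proc sort data=vendor (where=(&minid <= id <= &maxid))
          out=driver nodupkey;
   by id;               /* ID is the only variable in the file    */
run;
```

The WHERE= option drops the out-of-range keys before the sort sees them, and
NODUPKEY keeps one observation per ID.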
Any fruitful ideas? Maybe somebody has already done something like that?
Al