Date: Wed, 21 Jul 2010 07:05:30 -0400
Reply-To: Andy Arnold <awasas@COX.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Andy Arnold <awasas@COX.NET>
Subject: In search of a more efficient program
Background:
I've inherited a large, complex SAS program. Most files are quite small;
however, some are extremely large and have become a problem.
The files in question have 12-24 fields that are mostly numeric with a few
short (1-4 character) fields. The files that are killing me have 20M records
and one has 160M records.
The files don't use SAS compression because the records are too short to
make compression cost- and time-effective.
Problem 1:
What are the trade-offs if I force numeric length to 4 or 2? Does SAS
always use an 8-byte numeric representation internally? If I force short numeric field
lengths, will SAS have to convert them up to length=8 and back down again in
order to use the data?
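
To make the question concrete, here is a minimal sketch of the kind of change
being considered. The data set and variable names (work.big, work.big_short,
region_cd, amount) are made up for illustration and are not from the real
program. Two caveats that apply as far as I know: a length of 2 is accepted
only on mainframe (z/OS) SAS, with 3 the minimum on Windows and UNIX, and a
shortened numeric is only safe for integers small enough to be stored exactly.

```sas
/* Hypothetical names: work.big, work.big_short, region_cd, amount.      */
/* LENGTH sets how many bytes each numeric occupies in the stored data   */
/* set; it is a storage attribute, not a display format.                 */
data work.big_short;
   length region_cd 4   /* small integer code, stored in 4 bytes         */
          amount    8;  /* fractional values left at full length         */
   set work.big;
run;

/* Confirm the stored lengths and the resulting observation length.      */
proc contents data=work.big_short;
run;
```

On Windows and UNIX, a length of 4 stores integers exactly only up to
2,097,152, so any field that can exceed that, or that holds fractions, would
have to stay at 8.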
Problem 2:
What are the trade-offs if I use an index instead of a sort? The problem
sort takes an hour; it uses a single numeric key with about 10 distinct
values and sorts 20M records of about 100 bytes each. I've been successful with
indexing before; in that situation, I needed to sort the file in 15
different sequences.
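
For reference, the two alternatives look roughly like this. It is a sketch
only, with hypothetical names (work.big, work.big_sorted, work.by_region,
region_cd, n) standing in for the real data set and key.

```sas
/* Hypothetical names throughout; region_cd stands in for the numeric key. */

/* Option A: physically sort the data set, as the program does today.      */
proc sort data=work.big out=work.big_sorted;
   by region_cd;
run;

/* Option B: leave the data set in place and build a simple index on the   */
/* key.                                                                    */
proc datasets library=work nolist;
   modify big;
   index create region_cd;
quit;

/* With the index in place, BY processing works without a prior sort:      */
/* SAS uses the index to return observations in key order.                 */
data work.by_region;
   set work.big;
   by region_cd;
   if first.region_cd then n = 0;
   n + 1;                          /* sum statement: n is retained         */
   if last.region_cd then output;  /* one row per key value with its count */
   keep region_cd n;
run;
```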
Thanks for your input and advice.
--Andy