Before you spend too much time sizing a 6M record dataset from which
you plan to select 24 records (!), you might want to consider
filtering the 6M first. By that I mean writing a WHERE clause (or, when
reading raw input, a subsetting IF) that will select data from the input
buffer only if they meet certain conditions. Unless you need summary
values or comparisons
across records to determine whether or not to select a record, one
form of filtering or another should work. (In some instances it makes
sense to use views to preprocess records before they hit the WHERE
clause.)
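
A minimal sketch of that kind of filter for raw input, with assumptions
flagged: the fileref RAWIN, the file name, the column positions, and the
variables RECTYPE, ID and AMOUNT are all hypothetical, as is the 'AB'
test. Because WHERE applies only to SAS datasets, the sketch uses a
trailing @ and DELETE to do the same job on the input buffer.

```sas
/* Hypothetical fileref; point it at the real input file. */
filename rawin 'your.input.file';

data wanted;
  /* LENGTH= stores the current record's length in RECLEN.        */
  /* TRUNCOVER stops short records from reading into the next one. */
  infile rawin length=reclen truncover;

  /* Load the record into the input buffer and hold it there. */
  input @;

  /* Drop the short (114-byte) records before parsing any fields. */
  if reclen <= 114 then delete;

  /* Read only the selection field; the column layout is assumed. */
  input @1 rectype $char2. @;
  if rectype ne 'AB' then delete;

  /* Only records that survive the tests get a fuller read. */
  input @3 id $char8. @11 amount 8.;
run;
```

Rejected records never reach the output dataset, so WANTED holds only
the records you actually keep.
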
SAS excels at the task of scanning very large system files and
selecting records. You do not have to read a system file into a SAS
dataset first for any operation that refers to data from only one
record at a time. (Also, in fixed-format input, you do not have to
read in all of the column variables either. It often works best to
read in a limited number of columns, compare values across rows to
select a few record IDs, and then use the IDs to select a few
records from the full data file.) I have seen many acres of disk
space and many hours of CPU time wasted by programmers suffering from
misconceptions about SAS system file processing. (And I certainly
contributed some acres and hours to the totals before I discovered a
few basic truths about the SAS engine.)
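
The two-pass idea might look something like the sketch below. Treat it
as an illustration only: the fileref, the column positions, the
variables ID, AMOUNT and DESCR, and the "24 largest AMOUNT values" rule
are all made up, and the hash object needs SAS 9 or later (a format or
a merge would do the same job on older releases).

```sas
/* Hypothetical fileref; point it at the real input file. */
filename rawin 'your.input.file';

/* Pass 1: read only the columns needed to make the decision. */
data keys;
  infile rawin truncover;
  input @1 id $char8. @9 amount 8.;
run;

/* Compare across rows; here, rank by AMOUNT (an assumed rule). */
proc sort data=keys;
  by descending amount;
run;

/* Keep the IDs of the first 24 rows after sorting. */
data wanted_ids;
  set keys (obs=24);
  keep id;
run;

/* Pass 2: reread the file and keep only records with a wanted ID. */
data selected;
  length id $8;
  if _n_ = 1 then do;
    declare hash h(dataset: 'wanted_ids');
    h.defineKey('id');
    h.defineDone();
  end;

  infile rawin truncover;
  input @1 id $char8. @;           /* read the ID, hold the record */
  if h.check() ne 0 then delete;   /* nonzero means ID not wanted  */

  /* Full read for the few matching records; layout is assumed. */
  input @9 amount 8. @17 descr $char40.;
run;
```

Only the small KEYS dataset is ever sorted; the full-width records are
parsed just for the handful of IDs that matched.
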
You will get a lot of help from SAS-L if you can provide more details
about your programming task. Often the complexities of implementation
obscure simple methods of reducing a problem to a manageable size.
Sig
______________________________ Question _________________________________
Subject: Sizing Large SAS Data sets
Author: Bob Fitz <PW098@AOL.COM> at Internet-E-Mail
Date: 3/18/99 10:07 PM
Hi SAS-Ler's,
I have a variable length text file on MVS/ESA that will contain 6,000,000
records per month. I will have to keep 24 of them. The longest records are
1819 bytes (which contain 201 variables) and the other records are 114
bytes (with 14 variables). There is a 30% to 70% ratio of long records to
short records. How does one calculate the size of reading it into a SAS data
set vs the size of compressing the SAS data set vs leaving it as a text file?
Also how do you span multiple volumes and multiple tapes when you output? Any
advice, examples and documentation would be greatly appreciated.
Regards,
Bob Fitzgerald