Date: Mon, 28 Feb 2005 09:45:11 -0500
Reply-To: Larry Bertolini <bertolini.1@OSU.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Larry Bertolini <bertolini.1@OSU.EDU>
Organization: Ohio State University
Subject: Re: Creating dynamic Mainframe dataset using libname
Content-Type: text/plain; charset=us-ascii; format=flowed
Michael Raithel wrote:
> mainframe block size of 6144. That block size is a holdover
> from the Cold War. Additionally, it is very wasteful in terms of DASD
> track real estate and the number of additional I/O's it takes to haul
> your data from DASD to the computer memory and back again. You should
> seriously consider using the half-track block size of 27648, instead.
>
I agree, half-track LRECL and BLKSIZE are preferred in almost all
situations. They are more space-efficient in all cases, and are certainly
more CPU and I/O efficient for sequential access to datasets (I'd guess
that somewhere between 95% and 99.5% of all SAS code that gets executed
is basically sequential in nature).
But I wonder if the smaller blocksize might not be more efficient
when performing random access to a dataset; e.g., using POINT=,
or KEY= on an indexed dataset. Why move around 28K of data per I/O,
if you're only after a single, 1K observation, and if the odds of
needing another observation from the same block are very low?
I haven't benchmarked this scenario, but I'd expect random access
to an indexed SAS dataset, with KEY=foo / UNIQUE, to behave similarly
to random access to a VSAM KSDS, where a moderate-sized block
(control interval, in VSAM-ese) of 4K is typical.
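To put rough numbers on that question, here is a back-of-the-envelope sketch (in Python, only to keep it self-contained). It is not a benchmark. It assumes each random read transfers exactly one whole block, that the 1K observation does not span two blocks, and that nothing is cached; the function name transfer_ratio is made up for the illustration.

```python
# Bytes moved per random read, assuming one whole block is transferred
# per I/O and the wanted observation sits entirely inside that block.
OBS_BYTES = 1024  # the "single, 1K observation" from the discussion

# Block sizes mentioned in the thread (labels are illustrative only).
BLOCK_SIZES = {
    "half-track block (27648)": 27648,
    "block size from the quoted message (6144)": 6144,
    "typical VSAM control interval (4096)": 4096,
}


def transfer_ratio(block_bytes: int, obs_bytes: int = OBS_BYTES) -> float:
    """Return bytes transferred per byte actually wanted."""
    return block_bytes / obs_bytes


for label, size in BLOCK_SIZES.items():
    print(f"{label}: {transfer_ratio(size):.0f}x the observation size")
```

Under those assumptions the half-track block moves 27 times the data actually wanted per read, against 6 times for the 6144 block and 4 times for a 4K control interval. Whether that translates into measurable elapsed-time or CPU differences is exactly what would need benchmarking.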
(I suspect that if you can use SASFILE to keep the entire SAS
dataset in cache, the blocksize, as it relates to random I/O
performance, is probably irrelevant.)