Date: Tue, 12 Dec 2006 16:49:56 -0800
Reply-To: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Subject: Re: which file format loads quickest?
Hi Jared -
20000 variables is a very wide dataset. I'd look into normalizing the
data if possible, especially long, repetitious character fields.
I'd try to squeeze the data a bit, because I/O is a killer.
You can replace repetitious character data with short alphanumeric keys
and restore the long strings with formats.
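As a rough sketch of the key-plus-format idea: the format name $region,
the variable region_key, and the sample strings below are all made up
for illustration, not taken from your data.

```sas
/* Hypothetical lookup: a 1-byte key stands in for a long, repeated string. */
proc format;
  value $region
    'A'   = 'Northern California Regional Office'
    'B'   = 'Southern California Regional Office'
    other = 'Unknown';
run;

data slim;
  length region_key $ 1;        /* store 1 byte per value, not ~35 */
  input region_key $;
  format region_key $region.;   /* display the long text on output */
  datalines;
A
B
A
;
run;

/* PROC PRINT shows the long labels; the stored value is still the key. */
proc print data=slim;
run;
```

If you ever need the long string back as a real variable, something
like long_name = put(region_key, $region.); in a data step should do it.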
If you have categorical data stored as numbers, such as the integers
0-9, which I'd guess might be true in your 20000 vars, storing them as
single-byte character variables cuts the space by 7/8 in the SAS data
(1 byte instead of the default 8-byte numeric), although I can't say
what the impact on a transport file would be.
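Here is a minimal sketch of that conversion, assuming the categorical
variables really hold only the integers 0-9. The dataset and variable
names (wide_num, q1-q5, c1-c5) are invented for the example.

```sas
/* Small stand-in for the wide numeric data: each value is 0-9. */
data wide_num;
  input q1-q5;
  datalines;
0 3 9 1 7
2 2 5 8 4
;
run;

data wide_chr;
  set wide_num;
  array n{5} q1-q5;          /* 8-byte numerics (the default length) */
  array c{5} $ 1 c1-c5;      /* 1-byte character replacements        */
  do i = 1 to dim(n);
    c{i} = put(n{i}, 1.);    /* 0-9 becomes '0'-'9'; missing becomes '.' */
  end;
  drop i q1-q5;              /* keep only the 1-byte versions */
run;
```

When you need one as a number again, input(c1, 1.) converts it back.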
DDS Data Extraction
> Hello all,
> I am working in a Unix environment and I have a dataset which I would
> like to save with dimensions:
> ~20000 vars
> ~25000 obs
> logical record length of ~800000
> Saving this file as a V8 transport file means I can expect to wait
> about 11-12 minutes for it to load using proc cimport. Is there
> another format I can save this file in that would allow me to access
> with a quicker load time?