Date: Sat, 10 May 2003 01:10:59 -0600
Reply-To: Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Subject: Re: maximum number of variables in SAS
You're just not going about it right:
3 data x.test (keep=v000001-v300000 compress=char);
5 retain v000001-v300000 ' ';
NOTE: The data set X.TEST has 1 observations and 300000 variables.
NOTE: Compressing data set X.TEST decreased size by 99.81 percent.
NOTE: DATA statement used (Total process time):
real time 1:41.25
cpu time 51.71 seconds
I hope 300,000 variables is enough. Pretty good compression on this
The key is in the definition of the library:
1 libname x spde 'c:\temp';
NOTE: The SPDE Engine is pre-production for SAS V9.0.
NOTE: Libref X was successfully assigned as follows:
Physical Name: hamiltja:1696c:\temp\
The V9 SPDE engine lets you have many more variables. It also lets you
use multiple indexes at once. There are some disadvantages. I don't
think you can use constraints or create audit data sets. It also
doesn't appear that it will be speedy, but that might be a result of the
large number of variables in the data set and small amount of memory on
my machine, rather than something inherent in the SPDE engine.
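
For anyone who wants to try this, the log excerpts above can be put back into program order roughly as follows. This is a minimal sketch rather than the exact program that was run: the log skips some line numbers, so the LENGTH statement here is an assumption added to make the variable type explicit, and the path 'c:\temp' is simply the one shown in the log. The SPDE engine was flagged as pre-production in V9.0, so behaviour may differ on other releases.

```sas
/* Assign the library with the SPDE engine; this is what lifts the
   32K-variable limit seen with the default engine. */
libname x spde 'c:\temp';

/* Create one observation with 300,000 one-byte character variables. */
data x.test (keep=v000001-v300000 compress=char);
   /* Assumption: LENGTH is not shown in the original log; it is added
      here only to declare the variables explicitly as $1. */
   length v000001-v300000 $ 1;
   retain v000001-v300000 ' ';
run;
```

Expect it to take a while; the run in the log above needed well over a minute of real time.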
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA
>>> "Dale McLerran" <stringplayer_2@YAHOO.COM> 05/09/03 10:10PM >>>
--- John Whittington <John.W@MEDISCIENCE.CO.UK> wrote:
> Dale, although others seem to be saying that the limit is still present
> in v9, as I've just written in another message, that limit applies only
> to the number of variables that can be stored in a DATASET; with v8 (and
> I believe also v6) one can initialise (and process) an essentially
> unlimited number of variables within a DATA step, so long as one does
> not attempt to write more than 32K of them into a dataset. Using SAS 8.2:
> 264 data a (keep = a1-a32767) ;
> 265 array a(100000) $1 ;
> 266 run ;
> NOTE: The data set WORK.A has 1 observations and 32767 variables.
> Kind Regards,
Indeed, I have employed datasteps in which I use more than 32K
variables. However, I have some applications in which I need
to transpose a dataset which has tens of thousands of rows.
The rows of the original dataset are effectively candidate
variables for a machine learning problem.
However, it would appear that I will have to split the data
into subsets before transposing. As noted by Kevin Delaney
and Ed Heaton, V9 on a Windows platform does not deliver the
ability to write a dataset with more than 32K variables. Until
two or three months ago, I would never have thought that I
would need the ability to write a dataset with so many variables.
Ah, well, I have developed some workarounds already. I'll just
have to stick, for now, with chunking out my data.
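
For the record, one way the chunking could look is sketched below. This is not the code actually in use, only an illustration under stated assumptions: the macro name TRANSPOSE_CHUNKS, its parameters, and the data set names in the usage line are all hypothetical, and the chunk size of 32000 is simply a value chosen to stay under the 32K limit. Each chunk of rows is transposed into its own output data set, so no single data set ever needs more than CHUNK transposed variables.

```sas
/* Hypothetical helper: transpose a tall data set in row chunks so that
   each output data set stays under the 32K-variable limit. */
%macro transpose_chunks(in=, outprefix=, chunk=32000);
   %local dsid nobs rc start stop i;

   /* Count the rows of the input data set. */
   %let dsid = %sysfunc(open(&in));
   %let nobs = %sysfunc(attrn(&dsid, nlobs));
   %let rc   = %sysfunc(close(&dsid));

   %let i = 0;
   %do start = 1 %to &nobs %by &chunk;
      %let i    = %eval(&i + 1);
      %let stop = %sysfunc(min(&nobs, %eval(&start + &chunk - 1)));

      /* Rows START..STOP become the columns of output data set number I.
         With no VAR statement, PROC TRANSPOSE handles numeric variables only. */
      proc transpose data=&in (firstobs=&start obs=&stop)
                     out=&outprefix.&i prefix=c&i._;
      run;
   %end;
%mend transpose_chunks;

/* Usage with hypothetical data set names: */
%transpose_chunks(in=work.candidates, outprefix=work.wide_, chunk=32000)
```

The pieces work.wide_1, work.wide_2, and so on then have to be processed separately or merged inside a DATA step, where the 32K limit on stored variables does not bite until the result is written out.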
Fred Hutchinson Cancer Research Center
Ph: (206) 667-2926
Fax: (206) 667-5977