Date: Sat, 10 May 2003 01:10:59 -0600
Reply-To: Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Subject: Re: maximum number of variables in SAS
Content-Type: text/plain; charset=us-ascii
Dale,
You're just not going about it right:
=====
3    data x.test (keep=v000001-v300000 compress=char);
4
5       retain v000001-v300000 ' ';
6
7    run;
NOTE: The data set X.TEST has 1 observations and 300000 variables.
NOTE: Compressing data set X.TEST decreased size by 99.81 percent.
NOTE: DATA statement used (Total process time):
      real time           1:41.25
      cpu time            51.71 seconds
=====
I hope 300,000 variables is enough. Pretty good compression on this
baby, too.
The key is in the definition of the library:
=====
1    libname x spde 'c:\temp';
NOTE: The SPDE Engine is pre-production for SAS V9.0.
NOTE: Libref X was successfully assigned as follows:
      Engine:        SPDE
      Physical Name: hamiltja:1696c:\temp\
=====
The V9 SPDE engine lets you have many more variables. It also lets you
use multiple indexes at once. There are some disadvantages: I don't
think you can use integrity constraints or create audit data sets. It
also doesn't look speedy, but that might be a result of the large number
of variables in the data set and the small amount of memory on my
machine, rather than something inherent in the SPDE engine.
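
If you want to confirm the variable count without paging through PROC
CONTENTS output for 300,000 variables, something like the following
should work. It is only a sketch: it assumes the libref X and the data
set TEST from the log above, and I have not checked how the
pre-production SPDE engine reports itself in the dictionary tables.

```sas
/* Assumes libref X and data set TEST as created in the log above. */
/* DICTIONARY.TABLES stores libref and member names in uppercase.  */
proc sql;
   select nvar, nobs
      from dictionary.tables
      where libname = 'X' and memname = 'TEST';
quit;
```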
--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA
>>> "Dale McLerran" <stringplayer_2@YAHOO.COM> 05/09/03 10:10PM >>>
--- John Whittington <John.W@MEDISCIENCE.CO.UK> wrote:
>
> Dale, although others seem to be saying that the limit is still present
> in v9, as I've just written in another message, that limit applies only
> to the number of variables that can be stored in a DATASET; with v8 (and
> I believe also v6) one can initialise (and process) an essentially
> unlimited number of variables within a DATA step, so long as one does not
> attempt to write more than 32K of them into a dataset. Using SAS 8.2:
>
> 264 data a (keep = a1-a32767) ;
> 265 array a(100000) $1 ;
> 266 run ;
> NOTE: The data set WORK.A has 1 observations and 32767 variables.
>
> Kind Regards,
>
> John
>
John,
Indeed, I have written DATA steps that use more than 32K
variables. However, I have some applications in which I need
to transpose a dataset that has tens of thousands of rows.
The rows of the original dataset are effectively candidate
variables for a machine learning problem.
It would appear, then, that I will have to split the data
into subsets before transposing. As noted by Kevin Delaney
and Ed Heaton, V9 on a Windows platform does not deliver the
ability to write a dataset with more than 32K variables. Until
two or three months ago, I would never have thought that I
would need the ability to write a dataset with so many variables.
Ah, well, I have developed some workarounds already. I'll just
have to stick, for now, with chunking out my data.
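
For anyone curious, the chunking could be sketched roughly as below.
This is an illustration under stated assumptions, not the exact code in
use: the input data set WORK.LONG, its ID column NAME, the value columns
S1-S50, the macro name, and the chunk size are all hypothetical. Each
pass transposes one block of rows, so no output data set exceeds the
32K-variable limit.

```sas
/* Hypothetical input: WORK.LONG has one row per candidate variable, */
/* a character column NAME (unique, usable as a SAS variable name),  */
/* and numeric columns S1-S50, one per subject.                      */
%macro transpose_chunks(in=work.long, chunk=30000, nobs=);
   %local i first last n;
   /* Number of chunks needed, rounded up. */
   %let n = %sysevalf(&nobs / &chunk, ceil);
   %do i = 1 %to &n;
      %let first = %eval((&i - 1) * &chunk + 1);
      %let last  = %eval(&i * &chunk);
      /* FIRSTOBS= and OBS= select rows &first through &last; */
      /* OBS= is the last row number, not a row count.        */
      proc transpose data=&in (firstobs=&first obs=&last)
                     out=work.wide&i;
         id name;
         var s1-s50;
      run;
   %end;
%mend transpose_chunks;

/* Example call, assuming WORK.LONG has 100,000 rows. */
%transpose_chunks(nobs=100000);
```

Each WORK.WIDEn keeps the _NAME_ column (S1, S2, ...), so the pieces can
be processed separately or matched up by _NAME_ later.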
Dale
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------