LISTSERV at the University of Georgia
Date:         Sat, 10 May 2003 01:10:59 -0600
Reply-To:     Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jack Hamilton <jackhamilton@FIRSTHEALTH.COM>
Subject:      Re: maximum number of variables in SAS
Comments: To: stringplayer_2@YAHOO.COM
Content-Type: text/plain; charset=us-ascii

Dale,

You're just not going about it right:

=====
data x.test (keep=v000001-v300000 compress=char);

   retain v000001-v300000 ' ';

run;

NOTE: The data set X.TEST has 1 observations and 300000 variables.
NOTE: Compressing data set X.TEST decreased size by 99.81 percent.
NOTE: DATA statement used (Total process time):
      real time           1:41.25
      cpu time            51.71 seconds
=====

I hope 300,000 variables is enough. Pretty good compression on this baby, too.

The key is in the definition of the library:

=====
libname x spde 'c:\temp';
NOTE: The SPDE Engine is pre-production for SAS V9.0.
NOTE: Libref X was successfully assigned as follows:
      Engine:        SPDE
      Physical Name: hamiltja:1696c:\temp\
=====

The V9 SPDE engine lets you have many more variables. It also lets you use multiple indexes at once. There are some disadvantages: I don't think you can use constraints or create audit data sets. It also doesn't appear to be speedy, but that might be a result of the large number of variables in the data set and the small amount of memory on my machine, rather than something inherent in the SPDE engine.

--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA

>>> "Dale McLerran" <stringplayer_2@YAHOO.COM> 05/09/03 10:10PM >>>
--- John Whittington <John.W@MEDISCIENCE.CO.UK> wrote:
>
> Dale, although others seem to be saying that the limit is still present in
> v9, as I've just written in another message, that limit applies only to the
> number of variables that can be stored in a DATASET; with v8 (and I
> believe also v6) one can initialise (and process) an essentially unlimited
> number of variables within a DATA step, so long as one does not attempt to
> write more than 32K of them into a dataset.  Using SAS 8.2:
>
>   data a (keep = a1-a32767) ;
>      array a(100000) $1 ;
>   run ;
> NOTE: The data set WORK.A has 1 observations and 32767 variables.
>
> Kind Regards,
>
> John
>

John,

Indeed, I have employed DATA steps in which I use more than 32K variables. However, I have some applications in which I need to transpose a dataset which has tens of thousands of rows. The rows of the original dataset are effectively candidate variables for a machine learning problem.

However, it would appear that I will have to split the data into subsets before transposing. As noted by Kevin Delaney and Ed Heaton, V9 on a Windows platform does not deliver the ability to write a dataset with more than 32K variables. Until two or three months ago, I would never have thought that I would need the ability to write a dataset with so many variables. Ah, well, I have developed some workarounds already. I'll just have to stick, for now, with chunking out my data.
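For readers wondering what "chunking out" a transpose might look like, here is a minimal sketch. It is not Dale's actual workaround, which the thread does not show. Everything named here is hypothetical: the input data set WORK.CANDIDATES, its character ID variable VARNAME (assumed to hold valid, unique SAS names), the chunk size of 30000, and the output names WORK.T1, WORK.T2, and so on. Each pass transposes one block of rows, so no output data set exceeds the 32K-variable limit.

```sas
/* Sketch only: all data set, variable, and macro names are assumptions. */
%macro chunk_transpose(in=work.candidates, idvar=varname,
                       chunk=30000, outprefix=t);
   %local dsid nobs rc nchunks i first last;

   /* Count the rows of the input data set. */
   %let dsid = %sysfunc(open(&in));
   %let nobs = %sysfunc(attrn(&dsid, nlobs));
   %let rc   = %sysfunc(close(&dsid));

   /* Number of passes needed, rounded up. */
   %let nchunks = %sysevalf(&nobs / &chunk, ceil);

   %do i = 1 %to &nchunks;
      %let first = %eval((&i - 1) * &chunk + 1);
      %let last  = %eval(&i * &chunk);
      %if &last > &nobs %then %let last = &nobs;

      /* Rows &first through &last become the columns of WORK.T&i. */
      /* With no VAR statement, all numeric variables are transposed. */
      proc transpose data=&in (firstobs=&first obs=&last)
                     out=work.&outprefix.&i;
         id &idvar;
      run;
   %end;
%mend chunk_transpose;

%chunk_transpose()
```

The pieces share the same rows (one per original numeric column, identified by _NAME_), so they can be processed one at a time or merged later on _NAME_ in an engine that allows the full width.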

Dale

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
