LISTSERV at the University of Georgia
Date:         Wed, 26 Mar 2008 21:22:17 -0500
Reply-To:     "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Subject:      Re: Get out of memory in sas-macro
Comments: To: Stefan Pohl <stefan.pohl@ISH.DE>
Content-Type: text/plain; charset="iso-8859-15"

Stefan Pohl wrote:
> Hi sas-list,
>
> With the sas-macro %clusterexact (printed below) I get all m = n
> choose k combinations in a special output,
> a cluster notation.
>
> %clusterexact(9,5,126) works fine. Here I get
>
> idsrange
> [1 10]
> [1 1] [2 2] [3 3] [4 4] [5 5] [6 10]
> ...
>
> %clusterexact(79,5,22537515) works fine, too.
>
> But %clusterexact(98,5,67910864) produces out of memory. I know
> 67.910.864 is a very big number but SAS should handle
> this. Is there another way to store the combinations in idsrange to
> avoid the out of memory error? How should I modify my sas-macro?
>
> Is it impossible to run with success in a short future
> %clusterexact(3000,5,2.0183E15)???
>
> Thanks for your help, Stefan.

Stefan,

Interesting, but what the heck are you going to do with the 98C5 or 3000C5 combinations?

98C5 yields 6.8E7 rows, and each row needs a 38-byte representation template [1 ..] [.. ..] [.. ..] [.. ..] [.. ..], so the table size is around 2.58 GB (taking 1 GB = 1e9 bytes).

3000C5 yields 2.0183E15 rows, and each row needs a 56-byte representation template [1 ....] [.... ....] [.... ....] [.... ....] [.... ....], so the table size is around 113,022,440.8 GB (taking 1 GB = 1e9 bytes), or 113.02 PB (petabytes, 1 PB = 1e15 bytes). This is probably outside the reach of your system resources.

According to http://en.wikipedia.org/wiki/Petabyte, Google processes over 20 PB of data per day.

Here is a macro to generate the combinations. It is probably more memory efficient than PROC PLAN, since it does one specific task. The five-group [] representation of the combinations would have to be recoded from its output... But why do you need the [] representation in the first place? Can't you work with the combination array directly? Even stored as five 4-byte values per row, the 3000C5 rows would still consume more than 40.36 PB of disk space.
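
Before the macro itself, a quick check of that last figure. This is only a sketch: it assumes five numeric variables stored with LENGTH 4 (20 bytes per row), ignores data set overhead, and takes 1 GB = 1e9 bytes and 1 PB = 1e15 bytes. The variable names are made up for illustration.

```sas
data _null_;
  bytes_per_row = 5 * 4;   /* assumption: x1-x5 each stored in 4 bytes */

  size_gb = comb(98,5)   * bytes_per_row / 1e9;    /* 98C5 table, in GB   */
  size_pb = comb(3000,5) * bytes_per_row / 1e15;   /* 3000C5 table, in PB */

  put size_gb= size_pb=;
run;
```

This should come out near 1.36 GB for 98C5 and a little over 40.36 PB for 3000C5. Note that a SAS numeric variable takes 8 bytes by default, so reaching 20 bytes per row would need an explicit LENGTH statement (for example, length x1-x5 4;) in the step that writes the rows. The macro follows.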

------------------------------
%macro nested_loop_combos (data=, n=, m=);

  %local ncombos interval;

  /* number of combinations, and a progress interval of 1/20th of them */
  %let ncombos = %sysfunc(comb(&n,&m));
  %let interval = %sysevalf (&ncombos/20);

  %put ncombos=&ncombos;

  /* default output data set name, e.g. combos_9_5 */
  %if %length(&data) = 0 %then %let data=combos_&n._&m.;

  %local i;

  data &data;

    _n_ = 0;
    target = &interval;

    /* generate m nested loops: do x1 = x0+1 to n; do x2 = x1+1 to n; ... */
    x0 = 0;
    %do i = 1 %to &m;
      do x&i = x%eval(&i-1)+1 to &n;
    %end;

    /* count the rows and log progress each time an interval is passed */
    _n_ + 1;
    if _n_ > target then do;
      put _n_=;
      target + &interval;
    end;

    output;

    /* close the m nested loops */
    %do i = 1 %to &m;
      end;
    %end;

    stop;

    drop x0;
    format x: 4.;
  run;

%mend;

options mprint;
/*
%nested_loop_combos(n=9,m=5);
%nested_loop_combos(n=79,m=5);
%nested_loop_combos(n=98,m=5);
*/

/* table size estimates for the [] representation */
data _null_;
  size = comb(98,5) * 38 / 1e9;      /* GB */
  put size=;

  size = comb(3000,5) * 56 / 1e9;    /* GB */
  put size=;

  size = comb(3000,5) * 56 / 1e15;   /* PB */
  put size=;
run;
------------------------------
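
On working with the combinations directly: one option is never to store them at all. The sketch below walks the same nested loops for m=5 and keeps only the best combination seen so far, so the output is a single row. The score formula, the macro variable n, and the names best_combo, n_visited, best_score and b1-b5 are all hypothetical placeholders; substitute whatever you actually need to compute per combination.

```sas
%let n = 9;   /* small n for a quick check; comb(9,5) = 126 */

data best_combo;
  do x1 = 1 to &n;
    do x2 = x1 + 1 to &n;
      do x3 = x2 + 1 to &n;
        do x4 = x3 + 1 to &n;
          do x5 = x4 + 1 to &n;

            n_visited + 1;   /* running count of combinations visited */

            /* hypothetical criterion, for illustration only */
            score = mod(x1*x2 + x3*x4 + x5, 11);

            /* best_score starts out missing, which sorts below any number */
            if score > best_score then do;
              best_score = score;
              b1 = x1; b2 = x2; b3 = x3; b4 = x4; b5 = x5;
            end;

          end;
        end;
      end;
    end;
  end;

  output;   /* one row: the count and the best combination found */
  keep n_visited best_score b1-b5;
run;
```

With n=9 the step should report n_visited=126. This removes the storage problem but not the run time: visiting 2.0183E15 combinations for n=3000 is still far out of reach.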

-- Richard A. DeVenezia

