Date: Wed, 26 Mar 2008 21:22:17 -0500
Reply-To: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Subject: Re: Get out of memory in sas-macro
Stefan Pohl wrote:
> Hi sas-list,
>
> With the sas-macro %clusterexact (printed below) I get all m = n
> choose k combinations in a special output,
> a cluster notation.
>
> %clusterexact(9,5,126) works fine. Here I get
>
> idsrange
> [1 10]
> [1 1] [2 2] [3 3] [4 4] [5 5] [6 10]
> ...
>
> %clusterexact(79,5,22537515) works fine, too.
>
> But %clusterexact(98,5,67910864) produces out of memory. I know
> 67.910.864 is a very big number but SAS should handle
> this. Is there another way to store the combinations in idsrange to
> avoid the out of memory error? How should I modify my sas-macro?
>
> Is it impossible to run with success in a short future
> %clusterexact(3000,5,2.0183E15)???
>
> Thanks for your help, Stefan.

Stefan,

Interesting, but what the heck are you going to do with the 98C5 or 3000C5
combinations?

98C5 yields 6.8E7 rows, and needs a 38-byte representation template
[1 ..] [.. ..] [.. ..] [.. ..] [.. ..]
meaning the table size is around 2.58 GB (if 1 GB = 1e9 bytes).

3000C5 yields 2.0183E15 rows, and needs a 56-byte representation template
[1 ....] [.... ....] [.... ....] [.... ....] [.... ....]
meaning the table size is around 113,022,440.8 GB (if 1 GB = 1e9 bytes), or
113.02 PB (petabytes, 1e15 bytes). This is probably outside the reach of your
system resources.
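
The two template widths follow the same pattern, so the arithmetic can be sketched for any n. This is only an illustration under the assumption that the 5-group layout shown above is kept as is: one "[1 d]" group, four "[d d]" groups and four separating blanks, where d is the number of digits in n. The variable names are made up for the example.

```sas
/* Sketch: width of the 5-group bracket template and the resulting table size */
data _null_;
  do n = 98, 3000;
    d     = length(cats(n));                /* digits in the largest id      */
    width = (d + 4) + 4*(2*d + 3) + 4;      /* "[1 d]" + 4 x "[d d]" + blanks */
    rows  = comb(n, 5);                     /* number of combinations        */
    gb    = rows * width / 1e9;             /* 1 GB taken as 1e9 bytes       */
    put n= d= width= rows=comma24. gb=comma18.1;
  end;
run;
```

For n=98 and n=3000 the width works out to 38 and 56 bytes, matching the templates above.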

According to http://en.wikipedia.org/wiki/Petabyte, Google processes over
20 PB of data per day.

Here is a macro to generate the combinations. It is probably more memory
efficient than PROC PLAN since it does one specific task.

The 5-group [] representation of the combinations would have to be
recoded... But why do you need the [] representation in the first place?
Can't you work with the combination array directly? Even then, 3000C5 rows
stored as five 4-byte values would still consume >40.36 PB of disk space.
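
A note on the 4-byte figure: SAS numerics default to 8 bytes, so five unrestricted columns cost 40 bytes per row (about 80.7 PB for 3000C5). Getting down to 4 bytes takes a LENGTH statement. A minimal sketch for the small 9C5 case, assuming an IEEE host (Windows or Unix), where a 4-byte numeric holds integers exactly up to 2,097,152; the data set name is hypothetical:

```sas
/* Sketch only: 9C5 combinations stored as five 4-byte numerics */
data combos_9_5_len4;
  length x1-x5 4;           /* assumption: ids stay far below 2,097,152 */
  do x1 = 1 to 9;
  do x2 = x1+1 to 9;
  do x3 = x2+1 to 9;
  do x4 = x3+1 to 9;
  do x5 = x4+1 to 9;
    output;                 /* one row per combination */
  end; end; end; end; end;
run;

/* Check the stored variable lengths and the observation length */
proc contents data=combos_9_5_len4;
run;
```

The LENGTH statement only changes what is written to the data set; inside the step the loop variables are still full 8-byte numerics.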
------------------------------
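%macro nested_loop_combos (data=, n=, m=);
  /* Write all n-choose-m combinations to &data, one row per combination */
  %local ncombos interval i;

  %let ncombos  = %sysfunc(comb(&n,&m));
  %let interval = %sysevalf(&ncombos/20);   /* log progress every 5% */
  %put ncombos=&ncombos;

  /* default output name, e.g. combos_9_5 */
  %if %length(&data) = 0 %then
    %let data=combos_&n._&m.;

  data &data;
    _n_ = 0;
    target = &interval;
    x0 = 0;

    /* open m nested loops so that x1 < x2 < ... < xm */
    %do i = 1 %to &m;
      do x&i = x%eval(&i-1)+1 to &n;
    %end;

    _n_ + 1;
    if _n_ > target then do;
      put _n_=;
      target + &interval;
    end;
    output;

    /* close the m nested loops */
    %do i = 1 %to &m;
      end;
    %end;

    stop;
    drop x0 target;   /* keep only x1-x&m; target is just the progress counter */
    format x: 4.;
  run;
%mend;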
options mprint;
/*
%nested_loop_combos(n=9,m=5);
%nested_loop_combos(n=79,m=5);
%nested_loop_combos(n=98,m=5);
*/
data _null_;
size = comb(98,5) * 38 / 1e9;
put size=;
size = comb(3000,5) * 56 / 1e9;
put size=;
size = comb(3000,5) * 56 / 1e15;
put size=;
run;
------------------------------
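
If the combinations are only an intermediate step, one option is not to store them at all and do the work inside the loops. The following is just a sketch of that idea for 9C5; the test on the sum of the ids is a made-up placeholder for whatever the real per-combination processing is, and the names total and hits are hypothetical.

```sas
/* Sketch: consume each 9C5 combination in place, nothing is written to disk */
data _null_;
  do x1 = 1 to 9;
  do x2 = x1+1 to 9;
  do x3 = x2+1 to 9;
  do x4 = x3+1 to 9;
  do x5 = x4+1 to 9;
    total + 1;                               /* combinations visited       */
    /* hypothetical per-combination test, stands in for the real work */
    if sum(of x1-x5) <= 20 then hits + 1;
  end; end; end; end; end;
  put total= hits=;                          /* total should be 126 (9C5)  */
run;
```

Memory and disk use stay flat this way, but the run time still grows with the number of combinations, so 3000C5 remains out of reach.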
--
Richard A. DeVenezia