|
Dave Mabey at RDA8POSTOFFICE1@RDCCMAIL
10/24/97 07:10 PM
New Text Item: how to improve the efficiency of this code?
I think the attached log demonstrates one way of doing what you want.
Instead of TIME, I used DATE which is a 4 character string for
year/month between Jan 1,1992 and Dec 30,1997 (I enjoy not being Y2K
compliant ... flame away). If TIME is numeric, you will use something
like the following (untested) to generate the SELECT code.
SELECT 'WHEN ('||TIME||') LINE'||PUT(TIME,Z4.)||'=LINE'
Note: the SELECT in SQL has a very different meaning than SELECT in a
DATASTEP!
Note: The WARNING probably should read that MACROVARIABLE STRINGS CAN BE
AT THE MOST 32K CHARACTERS LONG. There can be no more that 200
characters between the semicolons; at least until we get version 7.
Note: If you run the following code, you may want to reduce the number
of observations from 3 million to a few thousand. You will get better
performance if TEST and NEW are not on the same disk pack.
Note: If you don't understand the SQL, check out the V6.12 changes and
enhancement documents. You won't find SEPARATED BY documented in many
other places. This code will not work on older versions of SAS.
124 data test;
125 do i=1 to 3000000;
126 date=put(int(uniform(1)*364*5+'1jan92'd),yymmn4.);
127 acc_id=int(uniform(2)*5000);
128 line=int(uniform(3)*10000);
129 output;
130 end;
131 run;
NOTE: The data set WORK.TEST has 3000000 observations and 4 variables.
NOTE: DATA statement used:
real time 1:12.160
cpu time 1:09.710
132
133 proc sql;
134 select 'when ('||quote(date)||') line'||date||'=line'
135 into :selcode separated by ';'
136 from
137 ( select distinct date
138 from test);
WARNING: Concatenated strings can be at most 200 characters.
NOTE: PROCEDURE SQL used:
real time 7:03.360
cpu time 2:03.887
139 data new;
140 set test;
141 select (date);
142 &selcode;
143 otherwise;
144 end;
145 run;
NOTE: The data set WORK.NEW has 3000000 observations and 64 variables.
NOTE: DATA statement used:
real time 13:52.530
cpu time 1:44.043
______________________________ Reply Separator ____________________________
_____
Subject: how to improve the efficiency of this code?
Author: pokchan@WELLSFARGO.COM@INTERNET at RDNOTES
Date: 10/24/97 1:24 PM
(Embedded image moved to file: PIC001.PCX)
Hi, SAS-Lers:
I have a sas program and it works but it is extremely inefficient. Here is
my code:
...
proc print data=test(obs=5);
title "before call macro";
run;
%macro app1(x, b, id);
data temp;
acc_id = &id;
line&x = &b;
run;
data all;
set all
temp;
run;
%mend app1;
%macro useapp;
data final;
set test; /* test has over 3 million observation and variable
time has about 65 distinct values);
call execute('%try1('|| time ||','|| line ||','|| acc_id ||')');
run;
%mend useapp;
%useapp;
data result;
merge all
final;
by acc_id;
run;
proc print data=result(obs=5);
title "after use macro";
run;
Here is the list file:
before call macro
TIME LINE ACC_ID
9605 5000 123
9704 15000 234
9605 23445 345
9612 12345 456
9002 23456 567
after call macro
TIME LINE ACC_ID LINE9002 LINE9605
LINE9612 LINE9704 ....(additional 61 columns)
9605 5000 123 . 5000
. .
9704 15000 234 .
. . 15000
9605 23445 345 . 23445
. .
9612 12345 456 .
. 12345 .
9002 23456 567 23456
. . .
Problem is:
Macro app1 has been called over 3 million times and each time data set all
has been opened and one record is appended to it.
I tried to use array. Something like:
Array line{} line0001 - line9704.
However before program run, I don't know what is the upper bound, in
this case, 9704. Of course, I can use proc means to produce max(time), and
then use call symput to generate a macro variable. Then next question, how
can I determine which variable to keep?
In the example above, we should keep line9002, line9605, line9612,
line9704, ... etc. Use proc sort to get distinct value of variable time,
and then use call symput to generate 65 macro variables? I think there is
limit of number of macro variables you can generate in one sas program
(maybe 32?).
Thank you for the help.
SMTPOriginator: owner-sas-l@UGA.CC.UGA.EDU
|