LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 1997, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Fri, 24 Oct 1997 19:10:00 -0400
Reply-To:   Dave_Mabey_at_RDA8POSTOFFICE1%RDCCMAIL%READERSDIGEST@READERSDIGEST.COM
Sender:   "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:   Dave_Mabey_at_RDA8POSTOFFICE1%RDCCMAIL%READERSDIGEST@READERSDIGEST.COM
Subject:   Re: how to improve the efficiency of this code?
Comments:   To: pokchan@WELLSFARGO.COM
Content-type:   text/plain; charset=US-ASCII

Dave Mabey at RDA8POSTOFFICE1@RDCCMAIL 10/24/97 07:10 PM

New Text Item: how to improve the efficiency of this code?

I think the attached log demonstrates one way of doing what you want. Instead of TIME, I used DATE which is a 4 character string for year/month between Jan 1,1992 and Dec 30,1997 (I enjoy not being Y2K compliant ... flame away). If TIME is numeric, you will use something like the following (untested) to generate the SELECT code.

SELECT 'WHEN ('||TIME||') LINE'||PUT(TIME,Z4.)||'=LINE'

Note: the SELECT in SQL has a very different meaning than SELECT in a DATASTEP!

Note: The WARNING probably should read that MACROVARIABLE STRINGS CAN BE AT THE MOST 32K CHARACTERS LONG. There can be no more that 200 characters between the semicolons; at least until we get version 7.

Note: If you run the following code, you may want to reduce the number of observations from 3 million to a few thousand. You will get better performance if TEST and NEW are not on the same disk pack.

Note: If you don't understand the SQL, check out the V6.12 changes and enhancement documents. You won't find SEPARATED BY documented in many other places. This code will not work on older versions of SAS.

124 data test; 125 do i=1 to 3000000; 126 date=put(int(uniform(1)*364*5+'1jan92'd),yymmn4.); 127 acc_id=int(uniform(2)*5000); 128 line=int(uniform(3)*10000); 129 output; 130 end; 131 run;

NOTE: The data set WORK.TEST has 3000000 observations and 4 variables. NOTE: DATA statement used: real time 1:12.160 cpu time 1:09.710

132 133 proc sql; 134 select 'when ('||quote(date)||') line'||date||'=line' 135 into :selcode separated by ';' 136 from 137 ( select distinct date 138 from test); WARNING: Concatenated strings can be at most 200 characters. NOTE: PROCEDURE SQL used: real time 7:03.360 cpu time 2:03.887

139 data new; 140 set test; 141 select (date); 142 &selcode; 143 otherwise; 144 end; 145 run;

NOTE: The data set WORK.NEW has 3000000 observations and 64 variables. NOTE: DATA statement used: real time 13:52.530 cpu time 1:44.043

______________________________ Reply Separator ____________________________ _____ Subject: how to improve the efficiency of this code? Author: pokchan@WELLSFARGO.COM@INTERNET at RDNOTES Date: 10/24/97 1:24 PM

(Embedded image moved to file: PIC001.PCX) Hi, SAS-Lers: I have a sas program and it works but it is extremely inefficient. Here is my code: ... proc print data=test(obs=5); title "before call macro"; run; %macro app1(x, b, id); data temp; acc_id = &id; line&x = &b; run; data all; set all temp; run; %mend app1; %macro useapp; data final; set test; /* test has over 3 million observation and variable time has about 65 distinct values); call execute('%try1('|| time ||','|| line ||','|| acc_id ||')'); run; %mend useapp; %useapp; data result; merge all final; by acc_id; run; proc print data=result(obs=5); title "after use macro"; run;

Here is the list file: before call macro TIME LINE ACC_ID 9605 5000 123 9704 15000 234 9605 23445 345 9612 12345 456 9002 23456 567 after call macro TIME LINE ACC_ID LINE9002 LINE9605 LINE9612 LINE9704 ....(additional 61 columns) 9605 5000 123 . 5000 . . 9704 15000 234 . . . 15000 9605 23445 345 . 23445 . . 9612 12345 456 . . 12345 . 9002 23456 567 23456 . . . Problem is: Macro app1 has been called over 3 million times and each time data set all has been opened and one record is appended to it. I tried to use array. Something like: Array line{} line0001 - line9704. However before program run, I don't know what is the upper bound, in this case, 9704. Of course, I can use proc means to produce max(time), and then use call symput to generate a macro variable. Then next question, how can I determine which variable to keep? In the example above, we should keep line9002, line9605, line9612, line9704, ... etc. Use proc sort to get distinct value of variable time, and then use call symput to generate 65 macro variables? I think there is limit of number of macro variables you can generate in one sas program (maybe 32?). Thank you for the help. SMTPOriginator: owner-sas-l@UGA.CC.UGA.EDU


Back to: Top of message | Previous page | Main SAS-L page