Date: Fri, 30 Aug 2002 17:15:32 -0400
Reply-To: "Abakah Nori (crm1nxa)" <crm1nxa@UPS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Abakah Nori (crm1nxa)" <crm1nxa@UPS.COM>
Subject: Re: Efficiency coding - Base
Just want to say thanks for all the excellent responses. I will try out the
techniques provided to see which provides the most savings in CPU time.
=========================
Abakah, Nori
Capacity & Performance Analyst
United Parcel Service
(201) 828-2351
-----Original Message-----
From: diskin.dennis@kendle.com [mailto:diskin.dennis@kendle.com]
Sent: Friday, August 30, 2002 11:50 AM
To: Abakah Nori (crm1nxa)
Cc: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Efficiency coding - Base
Hi Nori,
One way to reduce the time would be to:
1. Make a view to define DATE and HOUR and specify the order. This lets the
system try to optimize the process.
2. Use DATA step code to produce the sums. This requires some programming
skill and careful checking.
A skeleton follows; it is syntax checked but not data validated. I suggest
validating on test data and then doing a parallel run:
/* Build 10,000 test observations that mimic the CICSTRAN layout */
data a;
   array _in (*) IRESPTM TASCPUTM TDTOTCN TSTOTCN FCTOTCN SUSPNDTM TASDSPTM
                 WTDISPTM WTTCIOTM;
   do i = 1 to 10000;
      applid=1980+int(100*uniform(0));
      tranname='abcdf';
      tasknr=int(100*uniform(0));
      uowtime='12:23't + int(uniform(0)*10);
      system='CICS';
      TCLASS='1';
      do j = 1 to dim(_in);
         _in(j) = i+j;
      end;
      strttime=123456789;
      program=' ';
      /* mark every 100th row so the WHERE filter has something to drop */
      if mod(_in(1),100) = 1 then program='########';
      output;
   end;
run;
/* View that derives DATE and HOUR, drops the '########' rows, and sets the order */
proc sql;
   create view dated as
      select *, datepart(strttime) as date, hour(timepart(strttime)) as hour
      from a
      where program ne '########'
      order by APPLID, TRANNAME, TASKNR, UOWTIME, DATE, HOUR, system;
quit;
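data lparin(keep=APPLID TRANNAME TASKNR UOWTIME DATE HOUR SYSTEM
                 TCLASS IRESPTM TASCPUTM TDTOTCN TSTOTCN
                 FCTOTCN SUSPNDTM
                 TASDSPTM WTDISPTM WTTCIOTM);
   array _in (*) IRESPTM TASCPUTM TDTOTCN TSTOTCN FCTOTCN SUSPNDTM TASDSPTM
                 WTDISPTM WTTCIOTM;
   array _sum (9) _TEMPORARY_;   /* running totals, retained across observations */
   set dated;
   /* I assume SYSTEM is never a problem even though you don't sort on it */
   by APPLID TRANNAME TASKNR UOWTIME date hour;
   /* add this observation to the running totals for the current BY group */
   do _i = 1 to dim(_in);
      _sum(_i) + _in(_i);
   end;
   /* continue only on the last observation of each hour group */
   if last.hour;
   /* copy the totals back into the output variables */
   do _i = 1 to dim(_in);
      _in(_i) = _sum(_i);
   end;
   output;
   /* reset the totals for the next BY group */
   do _i = 1 to dim(_in);
      _sum(_i) = .;
   end;
run;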
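
The parallel run suggested above could be sketched as follows: run the original sort + PROC MEANS logic against the same view and compare the two results. This is only a sketch under assumptions. The dataset names BASE_IN and BASE_SUMS are made up for illustration, and SYSTEM is assumed to be constant within each hour group (the same assumption noted in the DATA step comment), so that LPARIN is already in ID order for PROC COMPARE.

```sas
/* Baseline: sort + PROC MEANS, as in the original job, run on the view DATED. */
/* BASE_IN and BASE_SUMS are hypothetical names used only for this check.      */
proc sort data=dated out=base_in;
   by applid tranname tasknr uowtime date hour system;
run;

proc means data=base_in noprint;
   by applid tranname tasknr uowtime date hour system;
   id tclass;
   var iresptm tascputm tdtotcn tstotcn fctotcn
       suspndtm tasdsptm wtdisptm wttciotm;
   output out=base_sums(drop=_type_ _freq_)
          sum=iresptm tascputm tdtotcn tstotcn fctotcn
              suspndtm tasdsptm wtdisptm wttciotm;
run;

/* Compare the DATA step result (LPARIN) with the baseline, matching on the */
/* BY variables. Both data sets must be in ID order for this to work.       */
proc compare base=base_sums compare=lparin listall;
   id applid tranname tasknr uowtime date hour system;
run;
```

If PROC COMPARE reports unequal values or unmatched observations, the first things to check would be whether SYSTEM really is constant within each hour group and whether TCLASS varies within a group, since the DATA step keeps the last TCLASS it reads while the ID statement in PROC MEANS keeps the maximum.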
From: "Abakah Nori (crm1nxa)" <crm1nxa@UPS.COM>@LISTSERV.UGA.EDU> on
08/30/2002 09:24 AM
Please respond to "Abakah Nori (crm1nxa)" <crm1nxa@UPS.COM>
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
To: SAS-L@LISTSERV.UGA.EDU
cc:
Subject: Efficiency coding - Base
Hi all,
I need some ideas on how to get this code to be more efficient. Once upon a
time, the input dataset contained < 1,000,000 obs so efficiency may not
have been top priority. Now, with > 14,000,000 obs, CPU time has
skyrocketed.
Any ideas will be greatly appreciated.
Thanks
Nori
13 DATA LPARIN;
14 SET LPARIN.CICSTRAN;
15 KEEP APPLID TRANNAME IRESPTM SYSTEM TASCPUTM TDTOTCN TSTOTCN
16 FCTOTCN SUSPNDTM TASDSPTM WTDISPTM WTTCIOTM UOWTIME TASKNR
17 TCLASS HOUR DATE;
18 HOUR=HOUR(TIMEPART(STRTTIME));
19 DATE=DATEPART(STRTTIME);
20 IF PROGRAM='########' THEN DELETE;
21
NOTE: THERE WERE 14778806 OBSERVATIONS READ FROM THE DATA SET
LPARIN.CICSTRAN.
NOTE: THE DATA SET WORK.LPARIN HAS 14774629 OBSERVATIONS AND 17 VARIABLES.
NOTE: COMPRESSING DATA SET WORK.LPARIN DECREASED SIZE BY 1.99 PERCENT.
COMPRESSED IS 42591 PAGES; UN-COMPRESSED WOULD REQUIRE 43455 PAGES.
NOTE: THE DATA STATEMENT USED 387.00 CPU SECONDS AND 11071K.
22 PROC SORT;
23 BY APPLID TRANNAME TASKNR UOWTIME DATE HOUR;
24
NOTE: THERE WERE 14774629 OBSERVATIONS READ FROM THE DATA SET WORK.LPARIN.
NOTE: THE DATA SET WORK.LPARIN HAS 14774629 OBSERVATIONS AND 17 VARIABLES.
NOTE: COMPRESSING DATA SET WORK.LPARIN DECREASED SIZE BY 1.99 PERCENT.
COMPRESSED IS 42589 PAGES; UN-COMPRESSED WOULD REQUIRE 43455 PAGES.
NOTE: THE PROCEDURE SORT USED 220.58 CPU SECONDS AND 11163K.
25 PROC MEANS NOPRINT;
26 BY APPLID TRANNAME TASKNR UOWTIME DATE HOUR SYSTEM;
27 ID TCLASS;
28 OUTPUT OUT=LPARIN
29 SUM=IRESPTM TASCPUTM TDTOTCN TSTOTCN FCTOTCN
30 SUSPNDTM TASDSPTM WTDISPTM WTTCIOTM;
31 VAR IRESPTM TASCPUTM TDTOTCN TSTOTCN FCTOTCN
32 SUSPNDTM TASDSPTM WTDISPTM WTTCIOTM;
NOTE: THERE WERE 14774629 OBSERVATIONS READ FROM THE DATA SET WORK.LPARIN.
NOTE: THE DATA SET WORK.LPARIN HAS 14159741 OBSERVATIONS AND 19 VARIABLES.
NOTE: COMPRESSING DATA SET WORK.LPARIN DECREASED SIZE BY 32.10 PERCENT.
COMPRESSED IS 46448 PAGES; UN-COMPRESSED WOULD REQUIRE 68405 PAGES.
NOTE: THE PROCEDURE MEANS USED 899.87 CPU SECONDS AND 11499K.
=========================
Abakah, Nori
Capacity & Performance Analyst
United Parcel Service
(201) 828-2351