LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2000, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 5 May 2000 21:40:18 GMT
Reply-To:     sashole@mediaone.net
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject:      Re: transpose?
Comments: To: Steve_Giblin@DOM.COM
Content-Type: text/plain; format=flowed

Steve,

Great job, thanks. I tend to agree with your conclusions, taking a small exception to the difficulty of putting a bunch of arrays together. It seems to be an easy prey for call execute(). Below, it assembles a number of implicit arrays which are then looped through. If there are character variables in the dataset, the _character_ array and a small macro %dolc() analogous to %doln() can be added.

proc sql noprint; select max(c) into : m from (select count(*) as c from a group by id); run; %let x = call execute; data _null_; set a; array nn _numeric_; length v $8; %macro doln (ln); do _i_=2 to dim(nn); call vname(nn,v); &x (&ln); end; %mend doln; &x ("data b;"); %doln ("array nn" || left(put(_i_-1,best.)) || trim(v) || "1-" || trim(v) || left("&m;")); &x ("array nn (_n_) nn1-nn" || left(put(_i_-2,best.))||";"); &x ("array v "); %doln (v); &x (";"); &x ("do _i_=1 by 1 until(last.id); set a; by id;"); &x ("do over nn; nn = v; end; end; run;"); stop; run;

Working against the original input

data a; input id age ctgy type score ; cards; 1 36 1 2 1 1 42 2 1 2 1 57 3 1 3 2 16 4 3 4 2 21 5 3 5 ; run;

the step assembles and submits the following code:

data b; array nn1 AGE1-AGE3; array nn2 CTGY1-CTGY3; array nn3 TYPE1-TYPE3; array nn4 SCORE1-SCORE3; array nn (_n_) nn1-nn4; array v AGE CTGY TYPE SCORE; do _i_=1 by 1 until(last.id); set a; by id; do over nn; nn = v; end; end; run;

I hope its idea is clear. Of course, the same can be done your way, i.e. by writing into a temp file and %including it. The point is, neither does take a lot of time to concoct, so much better performance does not come at too steep a price.

Kind regards, ======================= Paul M. Dorfman Jacksonville, FL =======================

>From: Steve_Giblin@DOM.COM >Reply-To: Steve_Giblin@DOM.COM >To: SAS-L@LISTSERV.UGA.EDU >Subject: Re: transpose? >Date: Thu, 4 May 2000 12:09:16 -0400 > >In continuance of this thread, I wish >to add a few other notes: > >Max's post indicated that his set contain 37 >variables, not just the age and type. > >The two methods that came to mind are RETAIN >and TRANSPOSE. > >Because of the 37 variables, this also could >involve macro / do loops. > >I took the time to enlarge the test set, >add more variables, and do some efficiency tests - >comparing TRANSPOSE with RETAIN. > >I had initially constructed the RETAIN method that >uses a counter, etc. and then I saw Paul Dorfman's >very clear and elegant post. > >It is still worthwile to compare the efficiency >of the TRANSPOSE with the now 2 types of RETAIN >methods - all with the 37 variables. > >I have assumed that all 37 are numeric variables. > >First, here are the results in run time for each type. >(Used with DATA _NULL_ and output the time to a file). > > >(Results with PC SAS - 6.12 - Windows ) > >begin transpose method 10:24:36 >end transpose method 10:28:30 234 seconds >-------------------------------- >(mine with count) >begin retain method 10:28:30 >end retain method 10:28:46 16 seconds >-------------------------------- >(Paul Dorfman's) >begin alt retain method 10:28:46 >end retain method 10:29:01 15 seconds > > > >Thus, the RETAIN method is 15 times faster, >let alone the extra disk space usage, >because you do not have to create 37 separate >data sets as with TRANSPOSE. > >Attached is the code for 2 cases: >( I have only attached Paul's method > for the RETAIN case > because of email size constraints ) > >A separate macro was created for each method: >--------------------------------------------- >%transpos - TRANSPOSE method >%retains - my RETAIN method with counter,etc >%altretn - Paul Dorfman's RETAIN method. > > > >Final note: > > The TRANSPOSE method took 1/10th the time > to code than the RETAIN methods, but you > can see where there is no need to have to > type out all 37 variables with their > associated renames, computations, and drops. > > The TRANSPOSE method was cleaner coding for > all 37 variables - %DO i=1 %TO 37 versus > the tricky way that stright DO's are used > in the DATA _NULL_ step to generate %INCLUDEable > SAS code. > > It depends if the code is for a "production" > type of environment or is a one-time ad-hoc. > >Steve > > > > >* This is the test case Max provided ; >* with a few extra cases and then ; >* a blow-up to be 9,000 * 4 = 36,000 ids; > > >data temp; >input x v1 v2; >array v (37) v1-v37; >do i=3 to 37; v(i)=v2; end; >drop i; >cards; >.1 36 2 >.1 42 1 >.1 57 1 >.2 16 3 >.2 21 3 >.3 8 2 >.3 12 2 >.3 32 3 >.3 35 1 >.3 36 1 >.4 18 2 >.4 40 1 >; > > >data temp; > set temp; > do ref=1 to 9000; id=ref+x; output; end; > drop x ref; > proc sort; by id; > run; > > data _null_; > file 'd:\check.sas'; > x=time(); format x time16.; > put @1 'begin transpose method' @25 x ; > run; > >%macro transpos; > > %do i=1 %to 37; > > proc transpose data=temp > out=temp&i (drop=_name_) > prefix=v&i.r; > by id; > var v&i; > %end; > > * now let sas create the merge statements; > > data _null_; > file 'd:\temp.sas'; > put 'data result;'; > put ' merge '; > do i=1 to 37; > ref=left(put(i,2.)); > put @5 'temp' ref; > end; > put ';'; > put ' by id;'; > put ' run;'; > run; > > %include 'd:\temp.sas'; > >%mend transpos; >%transpos; >run; > > data _null_; > file 'd:\check.sas' mod; > x=time(); format x time16.; > put @1 'end transpose method' @25 x ; > run; > > data _null_; > file 'd:\check.sas' mod; > x=time(); format x time16.; > put @1 'begin alt retain method' @25 x ; > run; > > >%macro altretn; > >* this establishes the max records per id; > >data _null_; > set temp end=lastrec; > by id; > retain cnt maxcnt; > if first.id=1 then cnt=.; > cnt=sum(cnt,1); > if last.id=1 then if cnt > maxcnt then maxcnt=cnt; > if lastrec=1 then do; > call symput('maxid',maxcnt); > end; > run; > > > > * Now you sas to create sas code to be %included; > >data _null_; > file 'd:\temp.sas'; > length ref $100; > > put 'data resultc;'; > do i=1 to 37; > ref = 'array v'||trim(left(put(i,2.)))||'r'; > put @5 ref '(&maxid);'; > end; > put ' '; > > > put ' do _n_=1 by 1 until (last.id); '; > put ' set temp (rename=( '; > do i=1 to 37; > ref=left(put(i,2.)); > put @15 'v' ref ' = ' 'w' ref; > end; > put ' ));'; > put ' by id;'; > put ' '; > > do i=1 to 37; > ref = 'v'||trim(left(put(i,2.)))|| > 'r(_n_) = w'|| > trim(left(put(i,2.)))||';'; > put @9 ref; > end; > put 'end;'; > put ' '; > > put 'drop w1-w37 ;'; > put 'run;'; > run; > > > %include 'd:\temp.sas'; > run; > >%mend altretn; >%altretn; > > > data _null_; > file 'd:\check.sas' mod; > x=time(); format x time16.; > put @1 'end retain method' @25 x ; > run;

________________________________________________________________________ Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com


Back to: Top of message | Previous page | Main SAS-L page