Date: Sat, 28 Jul 2007 22:01:39 -0400 "Howard Schreier " "SAS(r) Discussion" "Howard Schreier " Re: complex datastep logic

On Thu, 26 Jul 2007 06:09:00 -0400, Arild S <sko@KLP.NO> wrote:

>On Thu, 26 Jul 2007 08:24:01 -0000, vijayakumar <vijayk.kannan@GMAIL.COM> >wrote: > >>Dear list, >> >>I have raw dataset as shown below format >> >>Id visit lvl aa bb cc dd ee >>111 1 L1 1 1 1 . . >>111 2 L1 . 1 0 1 1 >>111 3 L1 1 . . . 1 >>111 1 L2 1 1 . . . >>111 2 L2 . . . . 1 >>111 3 L2 1 1 1 1 1 >>112 1 L4 1 1 1 1 1 >>112 2 L4 . 1 0 1 1 >>112 3 L4 1 . . . . >>112 1 S1 1 1 1 1 . >>112 2 S1 . . . . 1 >>112 3 S1 1 1 1 1 1 >> >> >>I want to create the final dataset as follows; >> > <cut..> >> >>At visit=1 for id 111 we have to compare "aa" value at lvl=L1 with >>lvl=L2 if both are 1 then o_aa=1 else o_aa=. , I have tried with proc >>transpose to do logic , if any have idea / provide code example means >>that will be great . >> >> >>Thanks, >>Vijay > >The *value* of LVL doesn't matter, does it? Can you simply compare >visits against each other and see if they are all "1" ? >I simply converted missing values to "0" to use min() function, and if min() >=1, then they are all 1, simple as that (given none are >1). > >data have; >input (Id visit lvl)(\$) aa bb cc dd ee; >cards; >111 1 L1 1 1 1 . . >111 2 L1 . 1 0 1 1 >111 3 L1 1 . . . 1 >111 1 L2 1 1 . . . >111 2 L2 . . . . 1 >111 3 L2 1 1 1 1 1 >112 1 L4 1 1 1 1 1 >112 2 L4 . 1 0 1 1 >112 3 L4 1 . . . . >112 1 S1 1 1 1 1 . >112 2 S1 . . . . 1 >112 3 S1 1 1 1 1 1 >; >run; > >proc sql; > create table want as > select bb.*, lvl, aa, bb, cc, dd, ee from > ( > (select * from have) aa > inner join > (select id, visit, > min(case when aa=1 then aa else 0 end) as o_aa, > min(case when bb=1 then bb else 0 end) as o_bb, > min(case when cc=1 then cc else 0 end) as o_cc, > min(case when dd=1 then dd else 0 end) as o_dd, > min(case when ee=1 then ee else 0 end) as o_ee > from have > group by id, visit) bb > on aa.id = bb.id and aa.visit = bb.visit > ) > ; >quit; > >proc print data=want width=minimum; > var id visit lvl aa--ee o_aa--o_ee; >run; > >Rgds,Arild

Here's a somewhat different SQL solution:

create table want as select * from ( select id , visit , case when sum(aa)=2 then 1 else . end as o_aa , case when sum(bb)=2 then 1 else . end as o_bb , case when sum(cc)=2 then 1 else . end as o_cc , case when sum(dd)=2 then 1 else . end as o_dd , case when sum(ee)=2 then 1 else . end as o_ee from have group by id, visit ) natural join have ;

This generates missing values, as requested, rather than zeroes.

The same logic can be used in a DATA step:

data want; do until (last.visit); set sorted; by id visit; array aaee(5) aa bb cc dd ee; array o_aaee(5) o_aa o_bb o_cc o_dd o_ee; do i = 1 to 5; o_aaee(i) = sum( o_aaee(i), aaee(i) ); end; end; do i = 1 to 5; if o_aaee(i)=2 then o_aaee(i) = 1; else o_aaee(i) = .; end; do until (last.visit); set sorted; by id visit; output; end; run;

Both the SQL and DATA step solutions would be less verbose and less complicated if the data structure were normalized (ie, if the data in variables AA through EE were stacked in one column).

Back to: Top of message | Previous page | Main SAS-L page