|
Richard -
Thanks for the code. It seems to work well. I've been playing with it on the test data, as well as the actual data. There are some kinks I need to work out that I forgot about...not with the logic, but with limiting this to a certain age.
I'm going through the code & trying to understand what each line does...so I can understand what is going on & hopefully learn something! The 3rd "when" line is the challenge at the moment. But so far, this seems to do the trick!
Thanks.
-Kat
----- Original Message ----
From: Richard A. DeVenezia <rdevenezia@WILDBLUE.NET>
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, November 2, 2006 9:07:39 AM
Subject: Re: Is this possible in PROC SQL? Assigning value by an ID var
kat j wrote:
> Can the following be done using PROC SQL? I think it can – and I
> think it will be the easiest way to accomplish what I want.
>
> I have data that look like the following:
> HH Status Int
...
>
> The main thing I’m trying to do is determine status by HH. In some
> cases, it may depend on the value of Int.
>
> Here are the rules:
> (i) If at least one obs per HH has a value for status of 1, I want
> HHStatus to be 1, regardless of the value of Int.
>
> (ii) If all obs per HH has a value for status of 2, I want HHStatus
> to be 2.
>
> (iii) If at least one obs per HH has a value for status of 9, with a
> value of Int as 9, and all other obs for the HH has a value for
> status of 2, I want HHStatus to be 9.
>
> And I want the HHStatus to be applied to the every observation.
>
> So for instance, the result would look like the following:
> HH Status Int HHStatus
> 1 1 1 1
> 1 1 1 1
> 1 1 1 1
> 2 1 1 1
> 2 2 1 1
> 2 2 1 1
> 2 2 1 1
> 3 2 1 9
> 3 2 1 9
> 3 2 1 9
> 3 9 9 9
> 4 2 1 2
> 4 2 1 2
> 4 2 1 2
>
> I believe this covers all the scenarios I can think of.
>
> I could probably do this using data steps, first.var, last.var,
> agreement variable, etc, but I was thinking that if this can be done
> using PROC SQL it might be a lot neater & quicker.
You can use SUM (<logical-expression>) to count the number of occurrences of
some conditions within the by group.
When criteria are multi-column conditional, you can sometimes use
concatenation to perform the check. Such is the case for (iii) in your
problem. Always be sure to carefully evaluate your criteria to ensure they
are mutually execlusive, and if not, order your case expressions in order of
preferred precedence.
data foo;
input a b c;
rowid+1;
cards;
1 1 1
1 1 1
1 1 1
2 1 1
2 2 1
2 2 1
2 2 1
3 2 1
3 2 1
3 2 1
3 9 9
4 2 1
4 2 1
4 2 1
run;
proc sql;
create table foo2 as
select
*
, case
when sum(b=1) > 0 then 1
when min(b)=max(b) and max(b) = 2 then 2
when sum(catx(',',b,c)='9,9') + sum(b=2) = count(*) then 9
end as hhstatus
from foo
group by a
order by rowid
;
quit;
--
Richard A. DeVenezia
http://www.devenezia.com/
|