|
Master Ian in a private mail to me pointed out that Sheila's
intention to sum up 9 and 7 might be because 2/1/01-2/28/01
and 3/1/01-4/4/01 are a continuous unbroken period. If there
is a gap, say 2/1/01-2/28/01 and 3/10/01-4/4/01, she might
want to leave them alone without combining them. If this is
indeed the intention of her, my previous code will fail.
To solve this problem, I have to flag those record with
consecutive date as same grp number, then in the final proc sql,
add this new var grp in the group by clause:
data d1;
input ID Date1 :mmddyy8. Date2 :mmddyy8. Var1 $;
format date1 date2 mmddyy8.;
cards;
62 01/15/2001 02/01/2001 W
62 03/01/2001 04/01/2001 M
62 04/01/2001 06/01/2001 C
65 02/01/2001 02/28/2001 M
65 03/05/2001 04/04/2001 M
65 04/05/2001 04/10/2001 M
65 04/05/2001 04/15/2001 B
;
data d2;
input ID Date3 :mmddyy8. Var2;
format date3 mmddyy8.;
cards;
62 01/20/2001 6
62 02/01/2001 4
62 03/01/2001 1
65 02/01/2001 9
65 04/04/2001 7
65 04/07/2001 4
65 05/01/2001 8
;
options nocenter;
proc sort data=d1;
by id var1 date1;
data d1 (drop=start end);
set d1;
by id var1;
retain start end;
format start end mmddyy8.;
if first.var1 then do;
start = date1;
end = date2;
grp=0;
end;
if not (date1=end or date1=end+1) then grp+1;
start = date1;
end = date2;
RUN;
proc sql;
select distinct coalesce(d1.id,d2.id) as id, var1, sum(var2) as var2
from d1 right join d2
on d1.id=d2.id and date1<=date3<=date2
group by d1.id, var1, grp
;
------------------------
id Var1 var2
----------------------------
62 M 1
62 W 10
65 8
65 B 4
65 M 9
65 M 11
Note that 9 and 7 were not summed up, but 7 and 4
were summed up to 11, because they were
in the same period, 3/5/01-4/4/01-4/5/01-4/10/01.
Ya
-----Original Message-----
From: Sheila Whitelaw [mailto:sw75@BOXFROG.COM]
Sent: Thursday, April 04, 2002 11:22 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Merge by range
Hi, I have the following two datasets:
Input dataset 1:
OBS ID Date1 Date2 Var1
1 62 01/15/2001 02/01/2001 W
2 62 03/01/2001 04/01/2001 M
3 62 04/01/2001 06/01/2001 C
4 65 02/01/2001 02/28/2001 M
5 65 03/01/2001 04/04/2001 M
6 65 04/05/2001 04/15/2001 B
Input dataset 2:
OBS ID Date3 Var2
1 62 01/20/2001 6
2 62 02/01/2001 4
3 62 03/01/2001 1
4 65 02/01/2001 9
5 65 04/04/2001 7
6 65 05/01/2001 8
Result:
OBS ID Var1 Var2 Explanation
1 62 W 10 (6+4) because date3 is between date1 and
date2
2 62 M 1
3 65 M 16 (9+7) because date3 is between date1 and
date2
for the same ID and Var1
4 65 8
Does anyone know how to accomplish the above merging using SAS?
Thanks for any help.
|