Date: Thu, 21 May 2009 04:42:18 -0700
Reply-To: pinu <amarmundankar@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: pinu <amarmundankar@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: Need help in selecting observations based on date variable
Content-Type: text/plain; charset=ISO-8859-1
On May 20, 11:51 pm, Nag <naga.amb...@gmail.com> wrote:
> I am working on health care data, I have to flag all claims for
> recipient based on following criteria.
>
> if there are more than two claims within a three month
> period or three claims within an 11 month period.
>
> Thanks in advance
/*Hi all,
I am not a proficient programmer but still I have found solution by
my own ways. There can be more efficient method than this.
If you find any bug do report it to me. Also if there are any ways
we can minimize the code, do tell me so.
How did I reach the solution ? Following are the steps:
Step 1: Sort the original Dataset.
Step 2: Just select those ids which have more than one claim.
Step 3: In this step, I created a flag variable which is containing
the all the ids which are following one of the
following condition:
a ) Ids which are claiming for more than 1 time within 3 months.
And
b ) Ids which are claiming for more than 2 times within 11
months.
Step 4: Now flag variable conains the value like (1, 1, 2). I
created a table with one variable as Id containing values (1,1,2)
Step 5: I selected only distict ids and put them in a table with
only one variable which is id.
Step 6: In this step, I selected all the records from table created
in Step 2, which are having same Ids as in table created in Step 5.
*/
/*This is the original data set*/
data main;
infile datalines;
input id date mmddyy10.;
datalines;
1 1/30/2009
2 2/15/2009
5 1/1/2009
1 4/15/2009
7 7/1/2009
9 1/20/2009
9 11/20/2009
1 7/10/2009
2 3/18/2009
;
proc print data = main;
run;
/* Step 1: Sort the original Dataset. */
proc sort data = main out= sorted_main;
by id ;
run;
proc print data= sorted_main;
run;
/* Step 2: Just select those ids which have more than one claim.*/
proc sql;
create table table2 as select * from sorted_main group by id having
count(id) > 1;
quit;
proc print data = table2;
run;
/*
Step 3: In this step, I created a flag variable which is containing
the all the ids which satisfy one of the
following condition:
a ) Ids which are claiming for more than 1 time within 3 months.
And
b ) Ids which are claiming for more than 2 times within 11
months.
*/
data table_with_main_logic ;
length flag $300;
retain flag ;
retain firstdate firstid;
retain countid 0;
set table2;
by id;
if first.id then do;
countid = 1;
firstdate = date;
firstid = id;
end;
else do;
countid = 1 + countid;
dtdiff = intck('month',date,firstdate);
*put dtdiff= ;
if abs(dtdiff) le 3 or (abs(dtdiff) ge 3 and countid ge 3) then
do ;
flag = catx(', ',flag,id);
*put flag=;
output;
end;
end;
run;
proc print data = table_with_main_logic;
run;
/*Step 4: Now flag variable conains the value like (1, 1, 2).
I created a table with one variable as Id containing values
(1,1,2) */
data table_with_dup_ids(keep = id);
set table_with_main_logic end= last nobs = numobs;
if last;
length a 8;
do i = 1 to numobs;
id = scan(flag,i);
output;
*put id= ;
end;
run;
proc print data = table_with_dup_ids;
run;
/* Step 5: I selected only distict ids and put them in a table with
only one variable which is id. */
proc sql print;
create table table_with_distinct_ids as select distinct id from
table_with_dup_ids;
quit;
proc print data = table_with_distinct_ids;
run;
/*Step 6: In this step, I selected all the records from table created
in Step 2,
which are having same Ids as in table created in Step 5. */
proc sql;
select table2.id,date format = mmddyy10. from table2,
table_with_distinct_ids where table2.id = table_with_distinct_ids.id;
quit;
THANKS AND REGARDS,
AMAR MUNDANKAR....
|