Date: Thu, 11 Feb 2010 07:02:53 -0800
Reply-To: mlhoward@avalon.net
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: Is there a better way to do this?
Content-Type: text/plain; charset="UTF-8"
Mike,
This is a very interesting idea. I was wondering if your idea could be expanded to include a format; we are often dealing with diagnosis lists that are very numerous combined into a few categories, such as diabetes, where we are doing put(icd9,$diagfmt.):
proc format;
value $diagsfmt
'7805'-'78059',
'3272'-'32729'='Sleep Disturbance'
'3273'-'32735',
'32737'-'32739'='Other Sleep'
other='Other';
run;
data want;
do until (last.id);
set have;
by id;
sleep_disturbance = sum(sleep_disturbance ,(put(icd9,$diagsfmt.) eq 'Sleep Disturbance'));
other_sleep = sum(other_sleep,(put(icd9,$diagsfmt.) eq 'Other Sleep'));
end;
exclusion = (sum(sleep_distubance, other_sleep) gt 0);
drop icd9;
run;
NOT tested- just interested in whether this would work.
-Mary
--- msz03@ALBANY.EDU wrote:
From: Mike Zdeb <msz03@ALBANY.EDU>
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Is there a better way to do this?
Date: Thu, 11 Feb 2010 08:49:20 -0500
hi ... I'm sure there will be lots of ideas, here's one that's pretty simple ...
data have;
length id icd9 $5 ;
input id $ icd9 $;
cards;
1 33333
1 22
1 49300
2 356
2 123
3 49321
4 49300
4 49301
4 49302
;
run;
data want;
do until (last.id);
set have;
by id;
dx49300 = sum(dx49300,(icd9 eq '49300'));
dx49301 = sum(dx49301,(icd9 eq '49301'));
dx49302 = sum(dx49302,(icd9 eq '49302'));
end;
exclusion = (sum(of dx:) gt 0);
drop icd9;
run;
proc print data=want noobs;
run;
output ...
id dx49300 dx49301 dx49302 exclusion
1 1 0 0 1
2 0 0 0 0
3 0 0 0 0
4 1 1 1 1
if you have LOTS of DX codes to check, you can have SAS write the SAS code for you
then use %include to add the code to the data step
if you don't want to see all the included statements in the LOG, get rid of the SOURCE2 option
put all your dx codes in the datalines file and each one produces a statement for inclusion later ...
* use a data step to write the necessary SAS code
filename dx_check temp;
data _null_;
input dx : $5. @@;
file dx_check;
text = catt('dx', dx , '=sum(dx' , dx , ',(icd9 eq : "' , dx , '"));' );
put text;
datalines;
49300 49301 49302
;
run;
data want;
do until (last.id);
set have;
by id;
%include dx_check / source2;
end;
exclusion = (sum(of dx:) gt 0);
drop icd9;
run;
proc print data=want noobs;
run;
output ...
id dx49300 dx49301 dx49302 exclusion
1 1 0 0 1
2 0 0 0 0
3 0 0 0 0
4 1 1 1 1
--
Mike Zdeb
U@Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475
> I am looking to this group for suggestions to improve or suggest a better
> way to come up with a solution to my problem.
>
> Background:
> I have one record per icd9 code per patient id. I have a list of 50 icd9
> codes that I need to see if the patient ever was diagnosed with that code by
> creating a new variable (0=no, 1=yes) for each wanted code. Output only one
> record per patient id containing only those new 0/1 variables and id.
>
> Example (for sake of brevity, I am only looking for 3 specific icd9 codes,
> but in the real run I will be looking for 50):
>
> I am searching the records for icd9 code = 49300, 49301 and 49302 (icd9 is a
> character variable). Create another new variable "exclusion" if we found any
> of the specific icd9 codes. Output only one record per patient id.
>
> data have;
> length id icd9 $5 ;
> input id $ icd9 $;
> cards;
> 1 33333
> 1 22
> 1 49300
> 2 356
> 2 123
> 3 49321
> 4 49300
> 4 49301
> 4 49302
> ;
> run;
>
> data want;
> set have;
> by id;
>
> retain dx49300 dx49301 dx49302 0;
> if first.id then
> do;
> dx49300 = 0;
> dx49301 = 0;
> dx49302 = 0;
> end;
>
> if icd9 = '49300' then dx49300 = 1 ;
> else if icd9 = '49301' then dx49301 = 1 ;
> else if icd9 = '49302' then dx49302 = 1 ;
>
> if sum(of dx49300 dx49301 dx49302) > 0 then exclusion = 1;
> else exclusion = 0;
>
> drop icd9;
> if last.id then output;
> run;
>
> proc print data=want noobs;
> run;
>
> As you can see, this can get quite tedious when I need to check for 50
> codes. What is a better, more efficient way of solving my problem?
>
> Thanks much.
> Dave
>