Date: Thu, 11 Feb 2010 12:12:00 -0500
Reply-To: msz03@albany.edu
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mike Zdeb <msz03@ALBANY.EDU>
Subject: Re: Is there a better way to do this?
Content-Type: text/plain;charset=iso-8859-1
hi ... it should work ... here's a short test ...
the statements ...
dxone = ^^dxone;
dxtwo = ^^dxtwo;
convert any sums greater than 1 to 1 so you get 0/1 values
try this example without those statement and you get a value of 3
for variable DXONE for ID = 4
(how many times do you get to write ^^ in your SAS code ... ^ very often)
proc format;
value $dx
'49300'-'49302' = 'DX493..'
'33333' = 'DX33333'
other = 'NOT IN A GROUP'
;
run;
data have;
input id : $1. icd9 : $5. @@;
cards;
1 33333 1 22 1 49300
2 356 2 123
3 49321
4 49300 4 49301 4 49302
;
run;
data want;
do until (last.id);
set have;
by id;
dxone = sum(dxone,(put(icd9,$dx.) eq 'DX493..'));
dxtwo = sum(dxtwo,(put(icd9,$dx.) eq 'DX33333'));
end;
dxone = ^^dxone;
dxtwo = ^^dxtwo;
exclusion = (sum(of dx:) gt 0);
drop icd9;
run;
proc print data=want noobs;
run;
id dxone dxtwo exclusion
1 1 1 1
2 0 0 0
3 0 0 0
4 1 0 1
and probably not as efficient as a format, this works
you could also have SAS write the NOT NOT statements ...
dx1 = ^^dx1;
dx2 = ^^dx2;
but I left that out (exercise for the students !!!)
filename dx_check temp;
data _null_;
input dx : $5. group : $1. @@;
file dx_check;
text = catt('dx', group , '=sum(dx' , group , ',(icd9 eq : "' , dx , '"));' );
put text;
datalines;
49300 1 49301 1 49302 1 33333 2
;
run;
data want;
do until (last.id);
set have;
by id;
%include dx_check / source2;
end;
dx1 = ^^dx1;
dx2 = ^^dx2;
exclusion = (sum(of dx:) gt 0);
drop icd9;
run;
proc print data=want noobs;
run;
id dx1 dx2 exclusion
1 1 1 1
2 0 0 0
3 0 0 0
4 1 0 1
--
Mike Zdeb
U@Albany School of Public Health
One University Place
Rensselaer, New York 12144-3456
P/518-402-6479 F/630-604-1475
> Mike,
>
> This is a very interesting idea. I was wondering if your idea could be expanded to include a format; we are often dealing with diagnosis lists that
> are very numerous combined into a few categories, such as diabetes, where we are doing put(icd9,$diagfmt.):
>
> proc format;
> value $diagsfmt
> '7805'-'78059',
> '3272'-'32729'='Sleep Disturbance'
> '3273'-'32735',
> '32737'-'32739'='Other Sleep'
> other='Other';
> run;
>
> data want;
> do until (last.id);
> set have;
> by id;
> sleep_disturbance = sum(sleep_disturbance ,(put(icd9,$diagsfmt.) eq 'Sleep Disturbance'));
> other_sleep = sum(other_sleep,(put(icd9,$diagsfmt.) eq 'Other Sleep'));
> end;
> exclusion = (sum(sleep_distubance, other_sleep) gt 0);
> drop icd9;
> run;
>
>
> NOT tested- just interested in whether this would work.
>
> -Mary
>
>
>
> --- msz03@ALBANY.EDU wrote:
>
> From: Mike Zdeb <msz03@ALBANY.EDU>
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Is there a better way to do this?
> Date: Thu, 11 Feb 2010 08:49:20 -0500
>
> hi ... I'm sure there will be lots of ideas, here's one that's pretty simple ...
>
> data have;
> length id icd9 $5 ;
> input id $ icd9 $;
> cards;
> 1 33333
> 1 22
> 1 49300
> 2 356
> 2 123
> 3 49321
> 4 49300
> 4 49301
> 4 49302
> ;
> run;
>
> data want;
> do until (last.id);
> set have;
> by id;
> dx49300 = sum(dx49300,(icd9 eq '49300'));
> dx49301 = sum(dx49301,(icd9 eq '49301'));
> dx49302 = sum(dx49302,(icd9 eq '49302'));
> end;
> exclusion = (sum(of dx:) gt 0);
> drop icd9;
> run;
>
> proc print data=want noobs;
> run;
>
> output ...
>
> id dx49300 dx49301 dx49302 exclusion
> 1 1 0 0 1
> 2 0 0 0 0
> 3 0 0 0 0
> 4 1 1 1 1
>
>
>
> if you have LOTS of DX codes to check, you can have SAS write the SAS code for you
> then use %include to add the code to the data step
> if you don't want to see all the included statements in the LOG, get rid of the SOURCE2 option
>
> put all your dx codes in the datalines file and each one produces a statement for inclusion later ...
>
>
> * use a data step to write the necessary SAS code
> filename dx_check temp;
> data _null_;
> input dx : $5. @@;
> file dx_check;
> text = catt('dx', dx , '=sum(dx' , dx , ',(icd9 eq : "' , dx , '"));' );
> put text;
> datalines;
> 49300 49301 49302
> ;
> run;
>
> data want;
> do until (last.id);
> set have;
> by id;
> %include dx_check / source2;
> end;
> exclusion = (sum(of dx:) gt 0);
> drop icd9;
> run;
>
> proc print data=want noobs;
> run;
>
> output ...
>
> id dx49300 dx49301 dx49302 exclusion
> 1 1 0 0 1
> 2 0 0 0 0
> 3 0 0 0 0
> 4 1 1 1 1
>
>
>
>
> --
> Mike Zdeb
> U@Albany School of Public Health
> One University Place
> Rensselaer, New York 12144-3456
> P/518-402-6479 F/630-604-1475
>
>> I am looking to this group for suggestions to improve or suggest a better
>> way to come up with a solution to my problem.
>>
>> Background:
>> I have one record per icd9 code per patient id. I have a list of 50 icd9
>> codes that I need to see if the patient ever was diagnosed with that code by
>> creating a new variable (0=no, 1=yes) for each wanted code. Output only one
>> record per patient id containing only those new 0/1 variables and id.
>>
>> Example (for sake of brevity, I am only looking for 3 specific icd9 codes,
>> but in the real run I will be looking for 50):
>>
>> I am searching the records for icd9 code = 49300, 49301 and 49302 (icd9 is a
>> character variable). Create another new variable "exclusion" if we found any
>> of the specific icd9 codes. Output only one record per patient id.
>>
>> data have;
>> length id icd9 $5 ;
>> input id $ icd9 $;
>> cards;
>> 1 33333
>> 1 22
>> 1 49300
>> 2 356
>> 2 123
>> 3 49321
>> 4 49300
>> 4 49301
>> 4 49302
>> ;
>> run;
>>
>> data want;
>> set have;
>> by id;
>>
>> retain dx49300 dx49301 dx49302 0;
>> if first.id then
>> do;
>> dx49300 = 0;
>> dx49301 = 0;
>> dx49302 = 0;
>> end;
>>
>> if icd9 = '49300' then dx49300 = 1 ;
>> else if icd9 = '49301' then dx49301 = 1 ;
>> else if icd9 = '49302' then dx49302 = 1 ;
>>
>> if sum(of dx49300 dx49301 dx49302) > 0 then exclusion = 1;
>> else exclusion = 0;
>>
>> drop icd9;
>> if last.id then output;
>> run;
>>
>> proc print data=want noobs;
>> run;
>>
>> As you can see, this can get quite tedious when I need to check for 50
>> codes. What is a better, more efficient way of solving my problem?
>>
>> Thanks much.
>> Dave
>>
>
|