Date: Thu, 11 Feb 2010 07:55:47 -0500
Reply-To: Nathaniel Wooding <nathaniel.wooding@DOM.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Nathaniel Wooding <nathaniel.wooding@DOM.COM>
Subject: Re: Is there a better way to do this?
In-Reply-To: <201002111045.o1B0ECWW005681@malibu.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"
Dave
Here is another way to do this. The one caveat is that the values in the final set are either 1 or missing but you could take care of that with an array and a simple loop and assignment.
Nat Wooding
data have;
length id icd9 $5 ;
input id $ icd9 $;
cards;
1 33333
1 22
1 49300
2 356
2 123
3 49321
4 49300
4 49301
4 49302
;
Proc Format;
value $match
'49300'='1'
'49301'='1'
'49302' ='1'
;
Data MT;* I.e., an empty data set;
retain dx49300 dx49301 dx49302 ;
run;
Data Want;
set have;
if Put( icd9 , Match5.) = icd9 then delete;
retain X 1 ;
run;
Proc Transpose data = want out = have ( drop = _name_ )Prefix = DX;
by id;
ID ICD9;
var x;
run;
Data Final;
set MT Have ;
if id gt '';
run;
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Dave
Sent: Thursday, February 11, 2010 5:46 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Is there a better way to do this?
I am looking to this group for suggestions to improve or suggest a better
way to come up with a solution to my problem.
Background:
I have one record per icd9 code per patient id. I have a list of 50 icd9
codes that I need to see if the patient ever was diagnosed with that code by
creating a new variable (0=no, 1=yes) for each wanted code. Output only one
record per patient id containing only those new 0/1 variables and id.
Example (for sake of brevity, I am only looking for 3 specific icd9 codes,
but in the real run I will be looking for 50):
I am searching the records for icd9 code = 49300, 49301 and 49302 (icd9 is a
character variable). Create another new variable "exclusion" if we found any
of the specific icd9 codes. Output only one record per patient id.
data have;
length id icd9 $5 ;
input id $ icd9 $;
cards;
1 33333
1 22
1 49300
2 356
2 123
3 49321
4 49300
4 49301
4 49302
;
run;
data want;
set have;
by id;
retain dx49300 dx49301 dx49302 0;
if first.id then
do;
dx49300 = 0;
dx49301 = 0;
dx49302 = 0;
end;
if icd9 = '49300' then dx49300 = 1 ;
else if icd9 = '49301' then dx49301 = 1 ;
else if icd9 = '49302' then dx49302 = 1 ;
if sum(of dx49300 dx49301 dx49302) > 0 then exclusion = 1;
else exclusion = 0;
drop icd9;
if last.id then output;
run;
proc print data=want noobs;
run;
As you can see, this can get quite tedious when I need to check for 50
codes. What is a better, more efficient way of solving my problem?
Thanks much.
Dave
CONFIDENTIALITY NOTICE: This electronic message contains
information which may be legally confidential and or privileged and
does not in any case represent a firm ENERGY COMMODITY bid or offer
relating thereto which binds the sender without an additional
express written confirmation to that effect. The information is
intended solely for the individual or entity named above and access
by anyone else is unauthorized. If you are not the intended
recipient, any disclosure, copying, distribution, or use of the
contents of this information is prohibited and may be unlawful. If
you have received this electronic transmission in error, please
reply immediately to the sender that you have received the message
in error, and delete it. Thank you.