Date: Thu, 11 Feb 2010 10:16:09 -0500
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Subject: Re: Is there a better way to do this?
Dave,
Did you submit Mike's code as written or did you originally change the
submit line from:
>> %include dx_check / source2;
to
>> %include dx_check.sas / source2;
Art
---------
On Thu, 11 Feb 2010 09:26:08 -0500, Dave Brewer <david.brewer@UC.EDU>
wrote:
>Hi Mike,
>
>Thanks much for your suggestion; it's pretty much what I was hoping for.
>
>When you create "data _null_" for the include (as I will have 50 icd9
>codes), I noticed that you didn't put an extension of ".sas" on your
>filename.
>
>When I executed your code, SAS complained that it couldn't
>find "dx_check.sas"; when I renamed the file in the null step
>to "dx_check.sas", it worked. Does this make sense?
>
>Thanks again for your help!
>Dave
>
>On Thu, 11 Feb 2010 08:49:20 -0500, Mike Zdeb <msz03@ALBANY.EDU> wrote:
>
>>hi ... I'm sure there will be lots of ideas, here's one that's pretty
>simple ...
>>
>>data have;
>>length id icd9 $5 ;
>>input id $ icd9 $;
>>cards;
>>1 33333
>>1 22
>>1 49300
>>2 356
>>2 123
>>3 49321
>>4 49300
>>4 49301
>>4 49302
>>;
>>run;
>>
>>data want;
>>do until (last.id);
>> set have;
>> by id;
>> dx49300 = sum(dx49300,(icd9 eq '49300'));
>> dx49301 = sum(dx49301,(icd9 eq '49301'));
>> dx49302 = sum(dx49302,(icd9 eq '49302'));
>>end;
>>exclusion = (sum(of dx:) gt 0);
>>drop icd9;
>>run;
>>
>>proc print data=want noobs;
>>run;
>>
>>output ...
>>
>>id dx49300 dx49301 dx49302 exclusion
>>1 1 0 0 1
>>2 0 0 0 0
>>3 0 0 0 0
>>4 1 1 1 1
>>
>>
>>
>>if you have LOTS of DX codes to check, you can have SAS write the SAS
>code for you
>>then use %include to add the code to the data step
>>if you don't want to see all the included statements in the LOG, get rid
>of the SOURCE2 option
>>
>>put all your dx codes in the datalines file and each one produces a
>statement for inclusion later ...
>>
>>
>>* use a data step to write the necessary SAS code
>>filename dx_check temp;
>>data _null_;
>>input dx : $5. @@;
>>file dx_check;
>>text = catt('dx', dx , '=sum(dx' , dx , ',(icd9 eq : "' , dx , '"));' );
>>put text;
>>datalines;
>>49300 49301 49302
>>;
>>run;
>>
>>data want;
>>do until (last.id);
>> set have;
>> by id;
>> %include dx_check / source2;
>>end;
>>exclusion = (sum(of dx:) gt 0);
>>drop icd9;
>>run;
>>
>>proc print data=want noobs;
>>run;
>>
>>output ...
>>
>>id dx49300 dx49301 dx49302 exclusion
>>1 1 0 0 1
>>2 0 0 0 0
>>3 0 0 0 0
>>4 1 1 1 1
>>
>>
>>
>>
>>--
>>Mike Zdeb
>>U@Albany School of Public Health
>>One University Place
>>Rensselaer, New York 12144-3456
>>P/518-402-6479 F/630-604-1475
>>
>>> I am looking to this group for suggestions to improve or suggest a
>better
>>> way to come up with a solution to my problem.
>>>
>>> Background:
>>> I have one record per icd9 code per patient id. I have a list of 50
icd9
>>> codes that I need to see if the patient ever was diagnosed with that
>code by
>>> creating a new variable (0=no, 1=yes) for each wanted code. Output only
>one
>>> record per patient id containing only those new 0/1 variables and id.
>>>
>>> Example (for sake of brevity, I am only looking for 3 specific icd9
>codes,
>>> but in the real run I will be looking for 50):
>>>
>>> I am searching the records for icd9 code = 49300, 49301 and 49302 (icd9
>is a
>>> character variable). Create another new variable "exclusion" if we
>found any
>>> of the specific icd9 codes. Output only one record per patient id.
>>>
>>> data have;
>>> length id icd9 $5 ;
>>> input id $ icd9 $;
>>> cards;
>>> 1 33333
>>> 1 22
>>> 1 49300
>>> 2 356
>>> 2 123
>>> 3 49321
>>> 4 49300
>>> 4 49301
>>> 4 49302
>>> ;
>>> run;
>>>
>>> data want;
>>> set have;
>>> by id;
>>>
>>> retain dx49300 dx49301 dx49302 0;
>>> if first.id then
>>> do;
>>> dx49300 = 0;
>>> dx49301 = 0;
>>> dx49302 = 0;
>>> end;
>>>
>>> if icd9 = '49300' then dx49300 = 1 ;
>>> else if icd9 = '49301' then dx49301 = 1 ;
>>> else if icd9 = '49302' then dx49302 = 1 ;
>>>
>>> if sum(of dx49300 dx49301 dx49302) > 0 then exclusion = 1;
>>> else exclusion = 0;
>>>
>>> drop icd9;
>>> if last.id then output;
>>> run;
>>>
>>> proc print data=want noobs;
>>> run;
>>>
>>> As you can see, this can get quite tedious when I need to check for 50
>>> codes. What is a better, more efficient way of solving my problem?
>>>
>>> Thanks much.
>>> Dave
>>>
|