| Date: | Wed, 2 May 2007 09:19:33 -0400 |
| Reply-To: | Sigurd Hermansen <HERMANS1@WESTAT.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Sigurd Hermansen <HERMANS1@WESTAT.COM> |
| Subject: | Re: Cleaning ICD9 codes |
| In-Reply-To: | <1178049878.018281.194260@y5g2000hsa.googlegroups.com> |
| Content-Type: | text/plain; charset="us-ascii" |
Joey:
The decimals in ICD9 codes do have meaning, but if they are removed you
may still be able to infer the level of the code from its length. If you
don't have
sufficient expertise in disease coding, consult with a coding specialist
(not me!) before you attempt to standardize codes. I've decoded two very
large sets of NDI COD results. Even the extensive documentation of ICD
codes may not tell you all you need to know. For example, we learned
from comparisons and frequencies that the source of codes may influence
how they should be interpreted. Physicians coding cancer causes of death
tended to choose 'Other' catch-all categories, whereas tumor registries
tended to choose more specific diagnoses.
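As a rough illustration of the point about inferring level from length, here is a minimal sketch, not a vetted solution. It assumes the TEST data set and character variables dx1-dx8 from the question below; the output names dxcode1-dxcode8 and dxlen1-dxlen8 are made up for the example. It also assumes leading zeros are intact, which should be checked against the actual data.

```sas
/* Sketch only: strip periods and white space, then record the cleaned length. */
data nodot;
   set test;
   array dx{8}     $ dx1-dx8;     /* existing character codes             */
   array dxcode{8} $ 8;           /* new character variables dxcode1-8    */
   array dxlen{8};                /* new numeric variables dxlen1-8       */
   do i = 1 to 8;
      if not missing(dx{i}) then do;
         /* remove every '.'; the 's' modifier also removes space characters */
         dxcode{i} = compress(dx{i}, '.', 's');
         dxlen{i}  = length(dxcode{i});
      end;
   end;
   drop i;
run;
```

For all-numeric ICD-9 diagnosis codes, a cleaned length of 3 corresponds to a category, 4 to a subcategory, and 5 to a subclassification. E codes differ, because the category itself takes four characters (E plus three digits), so length alone is not a safe guide for them. Two things appear to trip up the PRXCHANGE attempt in the question: the dxcode array is declared without `$`, so it is numeric, and the replacement `$1$3` puts the period back while dropping the digits captured in `$5`.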
For most applications I'd skip past code standardization and move
immediately to decoding the codes. Investigators may really need ICD-9
codes mapped to fairly broad categories. Solutions tailored to the
practical need at hand may prove much easier than general solutions.
Even so, it
took us many weeks to resolve all (at least almost all) of the oddities
that we found in over a million ICD-9/10 codes. I regard anything along
the same lines as a major data mining task.
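To make the idea of decoding straight to broad categories concrete, here is a hedged sketch. The data set CODES, its character variable CODE (stored without a decimal and at least three characters long), and the format name ICD9GRP are all hypothetical. Only three ICD-9 chapter ranges are shown; a real mapping would need review by a coding specialist and does not carry over to ICD-10.

```sas
/* Sketch only: map the 3-digit ICD-9 category to a few broad groups. */
proc format;
   value icd9grp
      140 - 239 = 'Neoplasms'
      390 - 459 = 'Circulatory system'
      460 - 519 = 'Respiratory system'
      other     = 'Other / unclassified';
run;

data grouped;
   set codes;                     /* hypothetical input data set */
   length category $ 24;
   /* ?? suppresses errors for E and V codes, which yield a missing value */
   cat3 = input(substr(code, 1, 3), ?? 3.);
   /* missing and unlisted values fall into the OTHER group */
   category = put(cat3, icd9grp.);
run;
```

A frequency table of CATEGORY against the raw codes is a quick way to surface oddities, such as codes that lost a leading zero and therefore land in the wrong range.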
It looks as if you are still in the early stages of the process. SAS-L
might be able to offer more help in response to a clearer statement of
objectives and context.
S
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of jofo
Sent: Tuesday, May 01, 2007 4:05 PM
To: sas-l@uga.edu
Subject: Cleaning ICD9 codes
Hi All,
I have some ICD9 codes that I would like to standardize before I search
through them.
Some have decimals, most do not.
I would like to remove the decimal or space in place.
I tried this...
data nodot;
   set test;
   array dx{8} $ dx1-dx8;
   array dxcode{8};
   do i=1 to 8;
      if length(dx{i}) then do;
         dxcode{i} = prxchange('s/(?:(.+)((\.)|(\s))(\d+))/$1$3/',0,dx{i});
      end;
   end;
run;
This fails miserably though.
Anyone know of a way to remove a dot or space from the middle of the
code? Anyone know if this is a bad idea and should be avoided?
Thanks.