LISTSERV at the University of Georgia
Date:   Wed, 2 May 2007 09:19:33 -0400
Reply-To:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:   Re: Cleaning ICD9 codes
In-Reply-To:   <1178049878.018281.194260@y5g2000hsa.googlegroups.com>
Content-Type:   text/plain; charset="us-ascii"

Joey: The decimals in ICD9 codes do have meaning, but, if removed, you may be able to infer the level of the code from its length. If you don't have sufficient expertise in disease coding, consult with a coding specialist (not me!) before you attempt to standardize codes. I've decoded two very large sets of NDI COD results. Even the extensive documentation of ICD codes may not tell you all you need to know. For example, we learned from comparisons and frequencies that the source of codes may influence how they should be interpreted. Physicians coding cancer causes of death tended to choose 'Other' catch-all categories in contrast to tumor registries that tend to choose more specific diagnoses.
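The remark about inferring a code's level from its length could be sketched roughly as follows. This is only an illustration under stated assumptions: the data set and variable names (codes, dx, dx_level) are hypothetical, the codes are assumed to be ICD-9-CM diagnosis codes with separators already removed, and E codes are assumed to have a four-character category where other codes have three. A coding specialist should confirm those assumptions for any real source.

```sas
/* Hypothetical sample: ICD-9 diagnosis codes with dots already removed */
data codes;
  input dx $ 1-6;
  datalines;
250
2500
25000
V103
E8120
;
run;

data levels;
  set codes;
  length dx_level $ 17;
  n_chars = lengthn(dx);              /* trailing blanks are not counted */
  /* Assumption: E codes carry a 4-character category, all others 3 */
  if upcase(substr(dx, 1, 1)) = 'E' then n_extra = n_chars - 4;
  else n_extra = n_chars - 3;
  select (n_extra);
    when (0) dx_level = 'category';
    when (1) dx_level = 'subcategory';
    when (2) dx_level = 'subclassification';
    otherwise dx_level = 'check manually';   /* too short or too long */
  end;
run;
```

Anything that lands in 'check manually' is worth a frequency count before deciding how to treat it.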

For most applications I'd skip code standardization and move straight to decoding the codes. Investigators may really need only ICD-9 codes mapped to fairly broad categories. Solutions tailored to a specific practical purpose may prove much easier than general solutions. Even so, it took us many weeks to resolve all (or at least almost all) of the oddities that we found in over a million ICD-9/10 codes. I regard any task along the same lines as a major data mining task.
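As a rough sketch of mapping codes straight to broad categories, a user-defined format over the three-digit category is one option. The format name icd9grp, the data set names, and the choice of groups are hypothetical. The three ranges shown follow the usual ICD-9 chapter boundaries but are only a fragment of a full mapping, and V and E codes are deliberately left unmapped here.

```sas
/* Illustrative fragment of a chapter-level grouping, not a complete map */
proc format;
  value icd9grp
    140 - 239 = 'Neoplasms'
    390 - 459 = 'Circulatory system'
    460 - 519 = 'Respiratory system'
    other     = 'Other / not mapped';
run;

/* Hypothetical sample codes, with and without separators */
data codes;
  input dx $ 1-8;
  datalines;
162.9
4280
486
V10.3
;
run;

data grouped;
  set codes;
  length dx_clean $ 8 dx_group $ 20;
  dx_clean = compress(dx, '. ');        /* drop dots and blanks */
  /* Map only all-digit codes; V and E codes fall through to the default */
  if lengthn(dx_clean) >= 3 and notdigit(strip(dx_clean)) = 0 then
    dx_group = put(input(substr(dx_clean, 1, 3), 3.), icd9grp.);
  else dx_group = 'Other / not mapped';
run;
```

Running PROC FREQ on dx_group and on the unmapped codes is a quick way to see which oddities still need attention.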

It looks as if you are still in the early stages of the process. SAS-L might be able to offer more help in response to a clearer statement of objectives and context. S

-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of jofo
Sent: Tuesday, May 01, 2007 4:05 PM
To: sas-l@uga.edu
Subject: Cleaning ICD9 codes

Hi All,

I have some ICD9 codes that I would like to standardize before I search through them.

Some have decimals, most do not.

I would like to remove the decimal or space in place. I tried this...

data nodot;
  set test;
  array dx{8} $ dx1-dx8;
  array dxcode{8};

  do i=1 to 8;
    if length(dx{i})then do;
      dxcode{i} = prxchange('s/(?:(.+)((\.)|(\s))(\d+))/$1$3/',0, dx{i});
    end;
  end;
run;

This fails miserably though.

Anyone know of a way to remove a dot or space from the middle of the code? Anyone know if this is a bad idea and should be avoided?
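One possible way to do this, offered as a minimal sketch and not as the list's answer, is COMPRESS instead of a regular expression. Three things in the attempt above look like likely culprits: dxcode is declared without $, so it is a numeric array; LENGTH returns 1 even for a blank value, so the IF test is always true; and the replacement $1$3 writes the dot back while dropping the digits after it. The sample values in test are hypothetical.

```sas
/* Hypothetical sample input */
data test;
  length dx1-dx8 $ 8;
  dx1 = '250.00';
  dx2 = '428 0';
  dx3 = 'V10.3';
  dx4 = '4019';
run;

data nodot;
  set test;
  array dx{8} $ dx1-dx8;
  array dxcode{8} $ 8;       /* character array; omitting $ makes it numeric */
  do i = 1 to 8;
    if not missing(dx{i}) then
      dxcode{i} = compress(dx{i}, '. ');   /* remove every dot and blank */
  end;
  drop i;
run;
```

If a regular expression is preferred, prxchange('s/[.\s]//', -1, dx{i}) should be equivalent, with -1 requesting replacement of every match. Whether removing the separators is wise at all is a separate question, as the reply above points out.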

Thanks.

