Date: Thu, 22 Mar 2012 11:05:04 -0400
Reply-To: Christopher Battiston <darth.pathos@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Christopher Battiston <darth.pathos@GMAIL.COM>
Subject: Re: Extract only numeric values from an alphanumeric column
Content-Type: text/plain; charset=ISO-8859-1
Hi Nat / Mary,
Thanks for the replies! I had a two-fold plan that I was aiming to
1) Because 90%+ of our lab values are manually entered, I was aiming to do
a mini-audit and see the max / min / averages / counts;
2) Dealing with Transplant, our patients are on immunosuppresants,
anti-hypertensives, etc. etc. and I was looking to graph by organ program
and see if there are any trends over the past few years.
Having read your replies, I'm thinking I'll do a PROC FREQ for the
non-numeric results, with bar charts accordingly.
As I just wrote the line above, is there a way to set up bins by group? As
an example, I have test A with 1, 1, 2, 3, 9 as results; I would want bins
of 1 - 2, 3 - 7, 8+. For Test B, where I have 0.1, 0.4, 0.8, 2.6, 12.9,
I'd want 0 - 1, 1.1-2.5, 2.5+. Understanding of course the bins would have
to be medically-determined, but I'm just wondering if there is a way for
now to set up SAS to auto-bin values based on the values in the group.
Thanks so much for your time
On Thu, Mar 22, 2012 at 10:53 AM, Mary <firstname.lastname@example.org> wrote:
> If these are medical lab results then you may want to do substitution
> based on which lab it is; for example; some labs could have values and
> others would have negative/positive. To do a blanket substition of numeric
> when some lab results are not by nature numeric would be a mistake; you
> would at least want to run transformations separately within each type of
> lab. A hemotocrit score, for instance, would have a value whereas a test
> for a presence of a virus would have POS or NEG.
> --- darth.pathos@GMAIL.COM wrote:
> From: Christopher Battiston <darth.pathos@GMAIL.COM>
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Extract only numeric values from an alphanumeric column
> Date: Thu, 22 Mar 2012 08:06:13 -0400
> Hi all,
> I have a Labs result table where the "Results" column has all sorts of data
> (POS, <3, 1+, etc.). I would like to somehow extract only the numeric data
> so I can graph the results but I'm not entirely sure how. I was hoping
> that I could use if the first character was a number or a decimal, but that
> doesn't work because I have results like the 1+ in the examples, and then
> others like .3a (which I believe is a typo but data quality is a challenge
> left for another day).
> I could use PROC SQL to do a "NOT LIKE '%a%" for all the letters and
> characters but that is just ridiculous. I went to Cody's Data Cleaning
> Techniques (rapidly becoming one of my most used books in my SAS
> collection) and found the NOTDIGIT function.
> I created a table in WORK using PROC SQL (I want only this year's data) and
> had set my DATA step as
> data _null_;
> set work.lab;
> file print;
> if notdigit(trim(result)) and not missing(result)
> then put result= result_date=;
> The problem is that it returns what appears to be my complete dataset (1.0,
> 32, >1000, NEG, etc. etc.). So, what am I missing? What I'm thinking is
> creating a data set with the results from the DATA step, and then delete
> those results from WORK.LAB.
> Any suggestions would be appreciated.