Date: Thu, 22 Mar 2012 09:14:17 -0700
Reply-To: mlhoward@avalon.net
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: Extract only numeric values from an alphanumeric column
Content-Type: text/plain; charset="UTF-8"
Better to create a new variable that transforms them; in a data set try somthing like this
proc format;
value $testafmt. = ;
data ...;
informat lab_result_formatted $50.;
if testname='Test A' then lab_result_formatted=put(lab_result, $testafmt.);
else if testname='Test B' then lab_result_formatted=put(lab_result, $testbfmt.);
and so forth.
Then when you are running frequencies; run them within the type of labtest, such as
proc freq data= ;
tables lab_result_formatted;
by testname;
run;
Not tested.
-Mary
--- darth.pathos@GMAIL.COM wrote:
From: Christopher Battiston <darth.pathos@GMAIL.COM>
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Extract only numeric values from an alphanumeric column
Date: Thu, 22 Mar 2012 11:11:25 -0400
Cool! Thanks Joe! It's save to assume then that I can set up different
grouped bins for different labs, and then just specify each in my proc freq?
For example
proc freq data=YOURDATA;
format Test_A bin8.
format Test_B bin7.
format Test_C bin12.;
tables YOURVAR;
run;
Have a great day
Chris
On Thu, Mar 22, 2012 at 11:07 AM, Joe Matise <snoopy369@gmail.com> wrote:
> You can set up a PROC FORMAT to quickly bin the results.
>
> proc format;
> value bin8
> 1-2 = "1-2"
> 3-7 = "3-7"
> 8-high="8+"
> other="OUT OF RANGE";
> quit;
>
> proc freq data=YOURDATA;
> format YOURVAR bin8.;
> tables YOURVAR;
> run;
>
> -Joe
>
>
> On Thu, Mar 22, 2012 at 10:05 AM, Christopher Battiston <
> darth.pathos@gmail.com> wrote:
>
>> Hi Nat / Mary,
>>
>> Thanks for the replies! I had a two-fold plan that I was aiming to
>> accomplish:
>> 1) Because 90%+ of our lab values are manually entered, I was aiming to do
>> a mini-audit and see the max / min / averages / counts;
>> 2) Dealing with Transplant, our patients are on immunosuppresants,
>> anti-hypertensives, etc. etc. and I was looking to graph by organ program
>> and see if there are any trends over the past few years.
>>
>> Having read your replies, I'm thinking I'll do a PROC FREQ for the
>> non-numeric results, with bar charts accordingly.
>>
>> As I just wrote the line above, is there a way to set up bins by group?
>> As
>> an example, I have test A with 1, 1, 2, 3, 9 as results; I would want bins
>> of 1 - 2, 3 - 7, 8+. For Test B, where I have 0.1, 0.4, 0.8, 2.6, 12.9,
>> I'd want 0 - 1, 1.1-2.5, 2.5+. Understanding of course the bins would
>> have
>> to be medically-determined, but I'm just wondering if there is a way for
>> now to set up SAS to auto-bin values based on the values in the group.
>>
>> Thanks so much for your time
>> Chris
>>
>> On Thu, Mar 22, 2012 at 10:53 AM, Mary <mlhoward@avalon.net> wrote:
>>
>> > If these are medical lab results then you may want to do substitution
>> > based on which lab it is; for example; some labs could have values and
>> > others would have negative/positive. To do a blanket substition of
>> numeric
>> > when some lab results are not by nature numeric would be a mistake; you
>> > would at least want to run transformations separately within each type
>> of
>> > lab. A hemotocrit score, for instance, would have a value whereas a
>> test
>> > for a presence of a virus would have POS or NEG.
>> >
>> > -Mary
>> >
>> > --- darth.pathos@GMAIL.COM wrote:
>> >
>> > From: Christopher Battiston <darth.pathos@GMAIL.COM>
>> > To: SAS-L@LISTSERV.UGA.EDU
>> > Subject: Extract only numeric values from an alphanumeric column
>> > Date: Thu, 22 Mar 2012 08:06:13 -0400
>> >
>> > Hi all,
>> >
>> > I have a Labs result table where the "Results" column has all sorts of
>> data
>> > (POS, <3, 1+, etc.). I would like to somehow extract only the numeric
>> data
>> > so I can graph the results but I'm not entirely sure how. I was hoping
>> > that I could use if the first character was a number or a decimal, but
>> that
>> > doesn't work because I have results like the 1+ in the examples, and
>> then
>> > others like .3a (which I believe is a typo but data quality is a
>> challenge
>> > left for another day).
>> >
>> > I could use PROC SQL to do a "NOT LIKE '%a%" for all the letters and
>> > characters but that is just ridiculous. I went to Cody's Data Cleaning
>> > Techniques (rapidly becoming one of my most used books in my SAS
>> > collection) and found the NOTDIGIT function.
>> >
>> > I created a table in WORK using PROC SQL (I want only this year's data)
>> and
>> > had set my DATA step as
>> >
>> > data _null_;
>> > set work.lab;
>> > file print;
>> > if notdigit(trim(result)) and not missing(result)
>> > then put result= result_date=;
>> > run;
>> >
>> > The problem is that it returns what appears to be my complete dataset
>> (1.0,
>> > 32, >1000, NEG, etc. etc.). So, what am I missing? What I'm thinking
>> is
>> > creating a data set with the results from the DATA step, and then delete
>> > those results from WORK.LAB.
>> >
>> > Any suggestions would be appreciated.
>> > Chris
>> >
>> >
>> >
>>
>
>
|