Date: Wed, 18 Aug 2010 19:07:19 -0400
Reply-To: Nat Wooding <nathani@VERIZON.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Nat Wooding <nathani@VERIZON.NET>
Subject: Re: OCR (was RE: BASE SAS CERTIFICATION)
Kevin
My experience with OCR is now a bit old, but I can say that success will
depend on the quality of the images and on whether the lettering was smeared
during scanning. The poorer the quality of the text, the poorer the
result.
One thing that you might do to check for spelling errors is to use
SAS's Proc Spell. Barbara Okerson has presented a couple of papers on obsolete
SAS procs, and Spell is included. She shows how to create your own word lists
in case there are valid spellings that are not found in the SAS list. One of
her sets of slides can be found at
http://vasug.files.wordpress.com/2009/04/old-but-not-obsolete-v3.pdf
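As a rough illustration of the word-list idea (sketched in Python, not
Proc Spell syntax; the function name, the word lists, and the sample text are
all made up for the example), one could flag every word in the OCR output
that appears in neither a base list nor a custom list of valid terms:

```python
import re


def find_unknown_words(text, base_words, custom_words=()):
    """Return words found in neither word list, in order of first appearance.

    Matching is case-insensitive; each unknown word is reported once.
    """
    known = {w.lower() for w in base_words} | {w.lower() for w in custom_words}
    unknown = []
    seen = set()
    for word in re.findall(r"[A-Za-z]+", text):
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)
            unknown.append(word)
    return unknown


# Hypothetical data: "Tbe" and "rccord" stand in for typical OCR misreads.
base = ["the", "data", "step", "reads", "one", "record", "into"]
custom = ["SAS", "PDV"]  # valid terms that a general word list would lack
ocr_text = "Tbe SAS data step reads one rccord into the PDV"

print(find_unknown_words(ocr_text, base, custom))
```

Without the custom list, valid terms such as SAS and PDV would be flagged
along with the real misreads, which is why building your own list matters.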
Nat Wooding
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Viel,
Kevin
Sent: Wednesday, August 18, 2010 5:10 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: OCR (was RE: BASE SAS CERTIFICATION)
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Michael Raithel
> Sent: Wednesday, August 18, 2010 1:09 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: BASE SAS CERTIFICATION
>
> Dear SAS-L-ers,
>
> Mike Zdeb posted the following:
>
> > hi ... if one wants to go back a bit further to "genesis" for
> > a discussion of what happens during a data step ...
> >
> > "The SAS Supervisor"
> > Don Henderson & Merry Rabb
> >
> > http://www.lexjansen.com/nesug/nesug88/sas_supervisor.pdf
> >
> > (PDF created by scanning my copy of the NESUG '88 proceedings,
> > so it's just an image and cannot be searched for text)
One can use an OCR program to enable this. I used Adobe, but one can find
freeware. I would love to hear comments about experience with this using
SAS to parse text. Our "eHR" consists of PDF-style documents. We have M4's
and nurses, among other highly skilled staff, parsing this manually after
searching for the respective file case-by-case :(
I think I am going to have a student or two tackle this, but the question is
HOW to get an electronic copy that preserves the form of the original,
especially when the document is only rendered and not *stored* as a PDF-like
file.
IS said they might be able to provide me with the print spool resulting from
their query.
Any comments are welcome. Wine is also warmly welcome.
-Kevin
Kevin Viel, PhD
Senior Research Statistician
Patient Safety & Quality
International College of Robotic Surgery
Saint Joseph's Translational Research Institute
Saint Joseph's Hospital
5671 Peachtree Dunwoody Road, NE, Suite 330
Atlanta, GA 30342
(678) 843-6076: Direct Phone
(678) 843-6153: Facsimile
(404) 558-1364: Mobile
kviel@sjha.org