Date: Tue, 10 Apr 2007 18:54:49 -0400
Reply-To: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Subject: Re: finding a string within data records
Kea
Before you read further, it would be interesting to know why you need to
do this text manipulation, which seems to need little of SAS.
After explaining to SAS-L why you need to do this, and although you have
had reasonable answers already from SAS-L, please try the code
below, which I think is the shortest form of the solution:
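Data _null_ ;
   /* scanOver: skip records until one contains 'ABC'; truncOver: accept a short final field */
   Infile 'xyz.dat' scanOver truncOver lrecl=10000 ;
   /* read up to 10 characters after 'ABC', then return to column 1 for the record number */
   Input @'ABC' string $10. @1 num $3. ;
   File 'PRT.dat' ;
   /* write the record number followed directly by the string */
   Put num $3. string ;
Run;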
Perhaps the original provider of this test wants you to discover the
infile options truncOver and scanOver and their interaction with
the input style
input @'ABC' .................
input @'ABC'
positions the input pointer after the next occurrence of 'ABC' on an input
record (which is where you want the 10-character string read). Subject to
other infile options, this could search through the whole file looking for
the next occurrence of 'ABC'.
truncOver
is relevant here because you need the $10. informat to read the 10-character
string immediately following 'ABC', and at least one of these strings is
truncated by an end-of-line with fewer than 10 characters available for the
informat.
In my experience, which may be platform and version dependent, if you
don't provide the informat at that point on the input statement, the
parsing of the line continues after the next blank space (or other
delimiter= value) following the 'ABC'.
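A small, self-contained way to see what truncOver does here, offered only as a
sketch: the fileref demo and the one-record test file are made up for
illustration, and it assumes a platform where PUT writes variable-length
records, so that the record really ends after '9UYUY5'. The same INPUT
statement is run once with missOver and once with truncOver.

```sas
/* Hypothetical test file: one record with only 6 characters after 'ABC' */
filename demo temp ;

data _null_ ;
   file demo ;
   put '00498UABC9UYUY5' ;
run ;

/* missOver: the $10. informat cannot be filled, so string is set to missing */
data _null_ ;
   infile demo missOver ;
   input @'ABC' string $10. ;
   put 'missOver:  ' string= ;
run ;

/* truncOver: the informat takes whatever is left on the record */
data _null_ ;
   infile demo truncOver ;
   input @'ABC' string $10. ;
   put 'truncOver: ' string= ;
run ;
```

In the log, the missOver step should report a blank string and the truncOver
step should report 9UYUY5.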
scanOver
is relevant because it allows the
input @'ABC'
to scan forward over multiple lines in the input file until a line
with 'ABC' is found. You also need TRUNCOVER behaviour, which cancels
the FLOWOVER behaviour (FLOWOVER was needed in the days before SAS8
introduced scanOver to allow this scanning forward without FLOWOVER).
Lrecl=
may not be needed, but I always insert this option as a reminder that the
default is 256, which is sometimes too short when parsing strange text.
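To try the whole thing without touching real files, here is a sketch that
writes the four sample records from the question to a temporary file and then
reads them back with the same INFILE options and INPUT statement as the code
near the top of this message. The fileref xyz is a stand-in for 'xyz.dat' and
is an assumption of this example; the results go to the log rather than to
PRT.dat.

```sas
/* Hypothetical fileref standing in for 'xyz.dat' */
filename xyz temp ;

/* Write the four sample records from the question */
data _null_ ;
   file xyz ;
   put '00187JT8O58TABC98J5TI4U5984J5T' ;
   put '002I4HTABC93T43IIRT309I' ;
   put '00309J5GI8JJJJJJ485JG495JG94' ;
   put '00498UABC9UYUY5' ;
run ;

/* Same INFILE options and INPUT statement; with no FILE statement, PUT writes to the log */
data _null_ ;
   infile xyz scanOver truncOver lrecl=10000 ;
   input @'ABC' string $10. @1 num $3. ;
   put num $3. string ;
run ;
```

If the options behave as described above, the log should show the three lines
Kea asked for, with record 003 absent.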
And, to all those who suffered my horrible message earlier that was just
UTF8 junk,
Sorry.
Peter Crawford
On Tue, 10 Apr 2007 04:33:27 -0700, kea2003@GMAIL.COM wrote:
>Good Day,
>
>I have an ASCII data file named XYZ.dat with variable record length,
>similar to this:
>
>00187JT8O58TABC98J5TI4U5984J5T
>002I4HTABC93T43IIRT309I
>00309J5GI8JJJJJJ485JG495JG94
>00498UABC9UYUY5
>
>The first 3 bytes is the record number.
>I'm trying to find the string "ABC" in each record and if found, I
>will write the following 10 bytes in another file named PRT, along
>with the record number.
>Following from the above data sample, the output file should look like
>this:
>
>00198J5TI4U59
>00293T43IIRT3
>0049UYUY5
>
>Please note that:
>- 'ABC' string in each record can have a different offset.
>- Record 003 has no ABC in it, so it is missing completely from the
>output file.
>- Record 004 has ABC in it, but rest of the record is less than 10
>bytes.
>
>I will be grateful if you could show me how to do this in SAS.
>Many thanks in advance...
>
>Kea.