Date: Mon, 20 Apr 2009 17:41:07 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Parse log contained in one variable
Content-Type: text/plain; charset=ISO-8859-1
Here is one solution using only PRX patterns (and please note I am not
particularly skilled at Perl regexps, so don't laugh at my simplistic
patterns). I use a separate variable for each element's position and
length; I don't think that's necessary, but I did it for clarity (to show
what each one holds). Make sure each variable's length is appropriate to
what it might hold, of course.
data test;
infile datalines truncover;
input str $200.;
datalines;
4/20/2009 15:46:13 John Smith: I processed transaction foo
4/19/2009 13:10:09 John Doe: Customer asked us to process transaction foo
;
run;
format date DATE9.;
format time TIME8.;
format name $50.;
format text $450.;
patternid_date = prxparse('/\d\d?\/\d\d?\/\d\d\d\d/');
patternid_time = prxparse('/\d\d?\:\d\d\:\d\d/');
patternid_name = prxparse('/[A-Za-z]+ [A-Za-z]+\:/');
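
For illustration, here is a minimal sketch of how the three patterns above could be used to pull the pieces out of each line. It is not the rest of the original program, just one way to finish the idea. It assumes the input data set is called TEST with the raw line in STR; the output data set name PARSED and the pos_*/len_* variable names are made up for this example. CALL PRXSUBSTR returns the start position and length of each match, and SUBSTR plus INPUT turn those into values.

```sas
/* Sketch only: the data set names and the pos_/len_ variables are illustrative. */
data test;
  infile datalines truncover;
  input str $200.;
  datalines;
4/20/2009 15:46:13 John Smith: I processed transaction foo
4/19/2009 13:10:09 John Doe: Customer asked us to process transaction foo
;
run;

data parsed;
  set test;
  length name $50 text $200;
  format date date9. time time8.;

  /* Compile each pattern once and keep the ids across observations. */
  retain patternid_date patternid_time patternid_name;
  if _n_ = 1 then do;
    patternid_date = prxparse('/\d\d?\/\d\d?\/\d\d\d\d/');
    patternid_time = prxparse('/\d\d?\:\d\d\:\d\d/');
    patternid_name = prxparse('/[A-Za-z]+ [A-Za-z]+\:/');
  end;

  /* Position and length of each match; position is 0 when there is no match. */
  call prxsubstr(patternid_date, str, pos_date, len_date);
  call prxsubstr(patternid_time, str, pos_time, len_time);
  call prxsubstr(patternid_name, str, pos_name, len_name);

  if pos_date > 0 then date = input(substr(str, pos_date, len_date), mmddyy10.);
  if pos_time > 0 then time = input(substr(str, pos_time, len_time), time8.);
  if pos_name > 0 then do;
    /* The name match includes the trailing colon, so drop the last character. */
    name = substr(str, pos_name, len_name - 1);
    /* Everything after the colon is treated as the comment text. */
    text = left(substr(str, pos_name + len_name));
  end;

  drop patternid_: pos_: len_:;
run;
```

The name pattern only handles two alphabetic words followed by a colon, so names with initials, hyphens, or a middle name would need a looser pattern.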
On Mon, Apr 20, 2009 at 5:12 PM, Joe Matise <firstname.lastname@example.org> wrote:
> You should use either perl reg exps (PRX* functions) or some variation
> Python's split is most similar to SCAN in SAS. SCAN lets you split a
> string up by a delimiter. In the strings you attached, SCAN(str,1,' ') gives
> you the date, SCAN(str,2,' ') gives you the time, and the rest should be
> read with substr or PRX functions. Unless you really have \n in there
> anyway, which would be useful.
> In general, see
> for the PRX functions.
> On Mon, Apr 20, 2009 at 4:55 PM, Andrew Z. <email@example.com> wrote:
>> I recently started to learn SAS (9.1/Windows), and now I need to parse
>> a log. Read from an ODBC source (with a poor design which I can't
>> change), each person has a single log with multiple events crammed in
>> a single variable. I want to break apart the log into multiple
>> observations and multiple variables, so I can use it like a database.
>> I've seen how to do this in SAS from a text file (INFILE/CARDS) but
>> not from ODBC. Please point me in the right direction.
>> An example of the log for one person
>> 4/20/2009 15:46:13 John Smith: I processed transaction foo
>> 4/19/2009 13:10:09 John Doe: Customer asked us to process transaction foo
>> If I were using a general purpose language like Python, I would use
>> split('\n') to split the log into multiple variables. Then, I would
>> parse out the date, time, creator's name, and comment using a POSIX or
>> Perl regular expression. Then, I would store the parsed data in a new
>> database table.
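
As a rough sketch of the split-into-observations step described in the question, SCAN can walk through the events in a loop and OUTPUT one row per event. This assumes the events really are separated by a line feed ('0A'x) inside the variable, which the thread does not confirm. The data set names RAW and EVENTS and the variables PERSON_ID, LOG, and EVENT are hypothetical stand-ins for whatever comes back from the ODBC source.

```sas
/* Hypothetical input: one row per person, all events packed into LOG, */
/* separated by a line feed ('0A'x). Adjust the delimiter to the data. */
data raw;
  length person_id 8 log $400;
  person_id = 1;
  log = '4/20/2009 15:46:13 John Smith: I processed transaction foo' || '0A'x ||
        '4/19/2009 13:10:09 John Doe: Customer asked us to process transaction foo';
run;

data events;
  set raw;
  length event $200;
  format date date9. time time8.;

  i = 1;
  event = scan(log, i, '0A'x);          /* first event in the log */
  do while (event ne ' ');              /* SCAN returns blank past the last event */
    date = input(scan(event, 1, ' '), mmddyy10.);  /* first blank-delimited word */
    time = input(scan(event, 2, ' '), time8.);     /* second blank-delimited word */
    output;                             /* one observation per event */
    i = i + 1;
    event = scan(log, i, '0A'x);
  end;

  drop i log;
run;
```

If the source uses carriage return plus line feed, passing '0D0A'x as the delimiter list should treat either character as a separator. The name and comment can then be pulled from EVENT with SUBSTR or the PRX functions.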