LISTSERV at the University of Georgia
Date:         Wed, 4 Jul 2007 15:08:44 -0400
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: Reducing a set of regex patterns used for matching?
Comments: To: "Richard A. DeVenezia" <rdevenezia@wildblue.net>
In-Reply-To:  <yacii.17$Tt.102501@news.sisna.com>
Content-Type: text/plain; charset="us-ascii"

Richard: Interesting question ... In these examples, the sensitivity of pattern i relative to pattern j shows up in whether pattern i, used as a regex, matches the text of pattern j, or vice versa. For example, pattern 5 matches the text of every other pattern and so dominates them on sensitivity, while pattern 0 is matched by every other pattern and so dominates them on specificity:

data test;
  /* column 1: pattern index; columns 5-33: pattern text */
  input @1 ptri 1. @5 string $char29. ;
  cards;
0   the three little pigs (title)
1   the.*?three.*?pig
2   little.*?pig
3   the.*?pig
4   e.*?little
5   e.*?i
;
run;

%macro testPattern(__index);
  %put index=&__index;
  /* fetch the text of the chosen pattern into a macro variable */
  proc sql noprint;
    select trim(string) into :__string
    from test
    where ptri=&__index ;
  quit;
  %put &__string;
  /* use the chosen pattern as a regex against the text of every other pattern */
  data patternMatch ;
    retain rxid;
    pattern="&__index";
    rxid=prxParse("/%trim(&__string)/");
    if missing(rxid) then do;
      putlog 'ERROR: malformed regex';
      stop;
    end;
    set test;
    if (ptri^=&__index) and prxMatch(rxid,trim(string)) then output;
  run;
%mend testPattern;

%testPattern(5)
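The same comparison can be run for every pair of patterns at once rather than one macro call per index. What follows is only a sketch, assuming the test data set created above; the names dominance, general and specific are made up for illustration, and the /i modifier is added because the original question describes the patterns as case-insensitive. Row 0 contains parentheses, which the regex engine treats as a group, so it is only meaningful here as a target and not as a pattern.

```sas
/* Sketch: each row of DOMINANCE says that the pattern GENERAL,   */
/* used as a regex, matches the text of the pattern SPECIFIC.     */
proc sql;
  create table dominance as
  select a.ptri as general,
         b.ptri as specific
  from test as a, test as b
  where a.ptri ne b.ptri
    /* build /pattern/i from the text in row a, apply it to row b */
    and prxmatch(cats('/', a.string, '/i'), trim(b.string)) > 0
  order by general, specific;
quit;

proc print data=dominance noobs;
run;
```

Passing the regex text directly to prxMatch recompiles it for every pair, which is acceptable for a handful of patterns but would be worth replacing with prxParse for a large set.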

I have to wonder whether this test works correctly in general for Perl regular expressions. It would probably fail to match some equivalent patterns; for example, /aa/ does not match the text a{2}, although the two patterns are equivalent.

S

-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Richard A. DeVenezia
Sent: Monday, July 02, 2007 3:14 PM
To: sas-l@uga.edu
Subject: Reducing a set of regex patterns used for matching?

Suppose you are given a set of simplistic regex patterns (case-insensitive, containing only .*? wildcarding) that are used for positive assertion in a broader associative mapping context.

0 the three little pigs (title)
1 the.*?three.*?pig
2 little.*?pig
3 the.*?pig
4 e.*?little

Is there a way to programmatically prune the set of filters 1-4?

For instance, 1 could be removed because 3 would match everything 1 would.
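One possible way to sketch such pruning in SAS is to drop any filter whose text is matched by some other filter used as a regex. This is a heuristic under stated assumptions, not a general solution: it relies on the filters containing only literal text and .*? wildcards, the data set names filters and pruned are made up for illustration, and two filters that match each other's text would both be dropped, so equivalent filters would need a tie-break.

```sas
/* Filters 1-4 from the question; the title (row 0) is left out. */
data filters;
  input @1 ptri 1. @3 string $char29.;
  datalines;
1 the.*?three.*?pig
2 little.*?pig
3 the.*?pig
4 e.*?little
;
run;

/* Keep filter p only if no other filter q, used as a            */
/* case-insensitive regex, matches the text of p.                */
proc sql;
  create table pruned as
  select p.ptri, p.string
  from filters as p
  where not exists (
    select *
    from filters as q
    where q.ptri ne p.ptri
      and prxmatch(cats('/', q.string, '/i'), trim(p.string)) > 0
  );
quit;
```

With the four filters above, filter 1 is the only one whose text is matched by another filter (3), which agrees with the example given in the question.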

-- Richard A. DeVenezia

