| Date: | Thu, 19 Oct 2006 12:19:01 -0700 |
| Reply-To: | David L Cassell <davidlcassell@MSN.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | David L Cassell <davidlcassell@MSN.COM> |
| Subject: | Re: wild card |
| In-Reply-To: | <1161278649.833973.159340@b28g2000cwb.googlegroups.com> |
| Content-Type: | text/plain; format=flowed |
|---|
jessica.donato@GMAIL.COM wrote back:
>"Nordlund, Dan DSHS" wrote:
> > > -----Original Message-----
> > > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of toby dunn
> > > Sent: Thursday, October 19, 2006 8:31 AM
> > > To: SAS-L@LISTSERV.UGA.EDU
> > > Subject: Re: wild card
> > >
> > > Philly ,
> > >
> > > Yeah you cant use the % the way you want here.
> > >
> > > Try this:
> > >
> > > If DrugProcSite In: ( "Mast" "modified" "mrm" "mat" "quad" ) Then V009 = 1 ;
> > > Else
> > > V009 = 2 ;
> > >
> > >
> > >
> > >
> > > Toby Dunn
> > >
> > > From: phillyj <jessica.donato@GMAIL.COM>
> > > Reply-To: phillyj <jessica.donato@GMAIL.COM>
> > > To: SAS-L@LISTSERV.UGA.EDU
> > > Subject: wild card
> > > Date: Thu, 19 Oct 2006 08:23:45 -0700
> > >
> > > Hi everyone:
> > >
> > > Im trying to use a wild card to select all observations with the
> > > specified letters in them. For example my code is:
> > >
> > > Data SurgeryPrim;
> > > Set PSurgery;
> > > if DrugProcSite in ("Mast%", "modified%", "mrm%", "mat%", "quad%") then
> > > v009=1; else v009=2;
> > > run;
> > >
> > > This code is not working though. Any suggestions for what I did wrong?
> > > I also tried to remove the % and keep only quotes.
> > >
> > > Thanks for your help!
> > >
> >
> > The colon modifier, as suggested by Toby, is a useful tool in many
> > situations like this. Just make sure that it is doing what you want. The
> > strings are compared only up to the length of the shorter of the two
> > strings being compared.
> >
> > DrugProcSite in: ("Mast", "modified", "mrm", "mat", "quad")
> >
> > will evaluate as true if DrugProcSite is "Master" as well as if it is
> > "Ma". Also remember that string compares are case sensitive, so you may
> > want to UPCASE(DrugProcSite) and make your comparison strings upper case if
> > case doesn't matter.
> >
> > Hope this is helpful,
> >
> > Dan
> >
> > Daniel J. Nordlund
> > Research and Data Analysis
> > Washington State Department of Social and Health Services
> > Olympia, WA 98504-5204
>
>Thanks so much for your suggestions! The dilemma I am running into now
>is that all observations beginning with the defined values are properly
>being coded, however those observations where the value sits within other
>text are not.
>For example:
>
>"Mast" is properly being coded but "Secondary Mast" is not. Does this
>make sense? Is there a way to expand the search through the text?
>
>Thanks!
>
If you want to match 'mast' (and the others as well) anywhere in the
text string, and also deal with uppercase vs. lowercase, you might want
to move to PRX functions. You write a regular expression (regex) that
meets your needs, and you have the PRXMATCH() function look for the
matches.

    data new;
      set YourData;
      retain re;  /* keep the compiled regex ID across observations */
      /* I always recommend this part for regex beginners */
      if _n_=1 then do;
        re = prxparse('/mast|modified|mrm|mat|quad/i');
        if missing(re) then do; putlog 'ERROR: bad regex'; stop; end;
      end;
      if prxmatch(re,DrugProcSite) then do;
        /* whatever logic you wanted to insert would go here */
      end;
    run;

The regex says to match:

    /          start of regex
    mast       the exact text 'mast'
    |          or
    modified   the exact text 'modified'
    |          or
    mrm        the exact text 'mrm'
    |          or
    mat        the exact text 'mat'
    |          or
    quad       the exact text 'quad'
    /i         end of regex AND 'i' for 'ignore case in matches'

This will match Mast and mast and MaSt, but it will also match
'mastodon' and 'mainmast' and 'stationmaster', because the
matching part 'mast' is in all of them. If you need more
precision on the match, you can do that too, but you have to
specify just what you want to match, and what you DON'T
want to match. The regex grammar has features for
handling things like 'only at the start of a word' or 'only as
an entire word' or 'only if followed by a number' or a million
other things. But you have to specify them.
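
As a rough illustration of that kind of precision, here is a minimal sketch using the Perl word-boundary token \b. The data set name demo, the variables starts_word and whole_word, and the sample values are made up for this example; only DrugProcSite comes from the original question.

```sas
/* Hypothetical test data: demo, starts_word and whole_word are
   illustrative names, not part of the original question. */
data demo;
  length DrugProcSite $40;
  infile datalines truncover;
  input DrugProcSite $40.;
  /* \b is a word boundary: 'mast' must begin a word */
  starts_word = (prxmatch('/\bmast/i', DrugProcSite) > 0);
  /* \b on both sides: 'mast' must be an entire word */
  whole_word  = (prxmatch('/\bmast\b/i', DrugProcSite) > 0);
  datalines;
Secondary Mast
mastodon
mainmast
stationmaster
;
run;

proc print data=demo;
run;
```

With these values, 'Secondary Mast' should be flagged by both tests, 'mastodon' only by starts_word, and 'mainmast' and 'stationmaster' by neither.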

Once you get quite comfortable with regexen (the official geek
plural of regex), you can use the SAS 9.1 feature of being able
to shove the regex into the PRXMATCH() function itself:

    data new;
      set YourData;
      if prxmatch( '/mast|modified|mrm|mat|quad/i', DrugProcSite) then do;
        /* whatever logic you wanted to insert would go here */
      end;
    run;

That looks simpler, but it is doing the same thing. Just without
any error-handling if you make a mistake in the regex. And no
one starts off able to write regexen without errors.
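
Applied to the original question, a sketch might look like the following. It assumes PSurgery, DrugProcSite and the 1/2 coding of v009 are exactly as in the first post, and that a match anywhere in the text is really what is wanted.

```sas
/* Sketch only: recode v009 from a case-insensitive match anywhere
   in DrugProcSite, reusing the names from the original post. */
data SurgeryPrim;
  set PSurgery;
  if prxmatch('/mast|modified|mrm|mat|quad/i', DrugProcSite) then v009 = 1;
  else v009 = 2;
run;
```

Keep in mind that short alternatives like 'mat' will also hit unrelated values such as 'hematoma', so check the matched rows before trusting the recode.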
HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330