Date: Wed, 19 May 2004 10:48:16 -0700
Reply-To: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Subject: Re: Sort by numeric suffix ?
Richard - Unfortunately I'm stuck in v8.2. I'm looking forward to moving
beyond RXMATCH and RXPARSE. Perhaps you should repost this thread as a Perl
regular expression question. I recently read a good introductory paper by
Ron Cody: http://www.nesug.org/Proceedings/nesug03/bt/bt002.pdf
but I can't play in v9 yet, so I'm no help here.
My program requires a prefix (which may be a blank) and a numeric suffix -
as specified by the problem statement ;-) One would need to test the
"position" variable to accommodate values with only a prefix or suffix.
Your use of the verify function is a good alternative to my compress/scan
method.
Regards-
Paul Choate
DDS Data Extraction
(916) 654-2160
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Richard
A. DeVenezia
Sent: Tuesday, May 18, 2004 9:29 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Sort by numeric suffix ?
Choate, Paul@DDS wrote:
> Richard -
>
> My last shot - this works also where the prefix ends in a blank.
>
> data foo;
>   do i = 1 to 150;
>     name = 'x' || left(put(i,4.-L));
>     output;
>     name = 'x_' || left(put(i,4.-L));
>     output;
>   end;
> run;
>
> data newfoo;
>   set foo;
>   position = index(name,scan(name,-1,compress(name,'0123456789')));
>   prefix=substr(name,1,position-1)||"_"; ** needed for prefix sort;
>   suffix=input(substr(name,position,vlength(name)-position+1),8.);
> run;
>
> proc sort data=newfoo out=newfoo(keep=name);
>   by prefix suffix;
> run;
I haven't decoded your answer yet, Paul... In the meantime, here are two
things I came up with.
1. The first is built on verify (left (reverse(name)), '0123456789');
2. The second is built on prxparse ('/(.*?)(\d*?)\s*?$/'); * v9 only;
- The important things here: ? is the non-greedy specifier, and \s*?$ is
required for (\d*?) to match properly at the end. Why? The trailing spaces in
SAS character variables are 'delivered' to the Perl subsystem. This delivery
confounds the initial expectation one might have, namely that the digits sit
at the very end of the string.
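To make the trailing-space point concrete, here is a minimal v9 sketch that runs the same value through the pattern with and without the \s*? term. The names rx_plain and rx_blank and the $8 length are made up for illustration; only the second pattern comes from this post. Without \s*?, I would expect the non-greedy prefix group to be forced to swallow the digits and the padding in order to reach $, leaving the suffix group empty.

```sas
data _null_;
  length name pre suf $8;
  name = 'x12';   /* stored as x12 followed by five trailing blanks */

  /* hypothetical comparison pattern: no allowance for trailing blanks */
  rx_plain = prxparse ('/(.*?)(\d*?)$/');
  /* pattern from this post */
  rx_blank = prxparse ('/(.*?)(\d*?)\s*?$/');

  if prxmatch (rx_plain, name) then do;
    pre = prxposn (rx_plain, 1, name);
    suf = prxposn (rx_plain, 2, name);
    put 'without \s*? : ' pre= suf=;
  end;

  if prxmatch (rx_blank, name) then do;
    pre = prxposn (rx_blank, 1, name);
    suf = prxposn (rx_blank, 2, name);
    put 'with \s*?    : ' pre= suf=;
  end;
run;
```

Comparing the two log lines should show whether the digits ended up in pre or in suf.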
data _null_;
  length name $32;

  /* test cases: mixed, digits only, trailing letter, no digits, */
  /* single characters, blank, and two values that fill all 32 bytes */
  name = 'ab123456_13';
  link rip;
  name = '12345';
  link rip;
  name = '123456a';
  link rip;
  name = 'foobar';
  link rip;
  name = '1';
  link rip;
  name = 'a';
  link rip;
  name = '';
  link rip;
  name = 'abcdefghijklmnopqrstuvwxyzabcdef';
  link rip;
  name = '00000000000000000000000000000123';
  link rip;
  stop;

rip:
  length prefix $32 suffix 8;

  /* method 1: p is the position of the first non-digit, counting from the right */
  namer = left (reverse(name));
  p = verify (namer, '0123456789');

  if p then do;
    pL = length(name)-p+1;   /* length of the prefix */
    sP = length(name)-p+2;   /* start position of the suffix */
    if p > length(name)
      then prefix = '';
      else prefix = substr (name,1,pL);
    if sP > length(name)
      then suffix = .;
      else suffix = input (substr (name,sP), 32.);
  end;
  else do;
    /* p = 0: all 32 positions are digits */
    prefix = '';
    suffix = input (name, 32.);
  end;

  put / name= prefix= suffix= ;

  /* method 2 (v9 only): compile the pattern once, then reuse it */
  retain rx .;
  if rx = . then
    rx = prxparse ('/(.*?)(\d*?)\s*?$/');

  if prxmatch (rx, name) then do;
    pre = prxposn (rx, 1, name);
    suf = prxposn (rx, 2, name);
    prefix = pre;
    suffix = input (suf,32.);
  end;
  else do;
    prefix = name;
    suffix = .;
  end;

  put name= prefix= suffix= ;
return;
run;
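For the original sorting question, the verify idea could be applied to a whole data set along these lines. This is only a sketch: the data set names split and sorted and the helper variable n are invented for illustration, the test data is shaped like Paul's but generated in descending order so the sort has something to do, and the final step borrows Paul's BY prefix suffix sort.

```sas
data foo;
  length name $32;
  do i = 150 to 1 by -1;
    name = 'x'  || left (put (i, 4.));
    output;
    name = 'x_' || left (put (i, 4.));
    output;
  end;
  drop i;
run;

data split;
  set foo;
  length prefix $32 suffix 8;

  /* n = number of trailing digits in name */
  p = verify (left (reverse (name)), '0123456789');
  if p = 0 then n = length (name);   /* every position is a digit */
  else n = p - 1;

  if n = 0 then do;                  /* no numeric suffix at all */
    prefix = name;
    suffix = .;
  end;
  else do;
    if n = length (name) then prefix = '';
    else prefix = substr (name, 1, length (name) - n);
    suffix = input (substr (name, length (name) - n + 1), 32.);
  end;
  drop p n;
run;

proc sort data=split out=sorted (keep=name);
  by prefix suffix;
run;
```

Values without a numeric suffix get a missing suffix, so they sort ahead of the numbered values that share their prefix.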
--
Richard A. DeVenezia
http://www.devenezia.com/downloads/sas/samples