Date: Tue, 13 Aug 2002 16:26:53 +0100
Reply-To: Peter Crawford <peter.crawford@DB.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter.crawford@DB.COM>
Subject: Re: indexW() behaviour
Content-type: text/plain; charset=iso-8859-1
Ian
thank you for the input.
When you say
"I would much rather have a default of word boundaries."
I'm not quite sure what you mean, except that your example appears to
use a comma delimiter, whereas the word boundary
expected by indexW() is a blank.
I agree that this is worthy of a third parameter, like the scan() delimiters.
... but see later
"DSD type" handling by scan(), respecting delimiters quoted in the data,
becomes available in v9.0, but as a new function:
>SCANQ function, version 9 and later Oct 23, 2001
>
>Returns the nth "word" from a character expression, ignoring
>delimiters inside quotation marks.
see ftp://ftp.sas.com/pub/neural/functions/scanq.txt
In the parent of this unofficial description/documentation,
ftp://ftp.sas.com/pub/neural/functions.html
it declares
INDEXW has an optional third argument to specify delimiters
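If that third argument works like the scan() delimiter list, Ian's example could be written roughly as below. This is only a sketch under that assumption: the delimiter string ",@" and the exact v9 behaviour are my guess from the one-line description, not something I have run.

```sas
/* Sketch only: assumes v9 indexW() takes a delimiter list as its third argument */
data _null_ ;
   x = "abc,def@" ;
   /* treat comma and at-sign, rather than blank, as the word boundaries */
   abc = indexw(x, "abc", ",@") ;
   def = indexw(x, "def", ",@") ;
   put abc= def= ;
run ;
```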
So, it seems we should have this improvement to indexW()
"real soon now"
Kind Regards
Peter Crawford
Date: 13/08/2002 16:10
To: Peter Crawford/Zentrale/DeuBaExt@Zentrale
SAS-L@LISTSERV.UGA.EDU
Subject: RE: indexW() behaviour
Message text:
Peter,
I am not surprised. A blank is not a word, in my book, so I would not
expect it to be found. On the other hand, I did find the following log rather
disturbing, albeit documented. I would much rather have a default of word
boundaries. And while I am thinking about it, SCAN should have another
parameter, so that it could behave like DSD on INFILE statements. Of
course, the default value should be to behave as it does now.
data _null_ ;

   x = "abc,def@" ;
   abc = indexw(x, "abc") ;
   comma = indexw(x,",") ;
   def = indexw(x,"def") ;
   at = indexw(x,"@") ;
   put _all_ ;
run ;
x=abc,def@ abc=0 comma=0 def=0 at=0 _ERROR_=0 _N_=1
NOTE: DATA statement used:
real time 0.05 seconds
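Until indexW() accepts other delimiters, one possible workaround is to map them to blanks before searching. A minimal sketch, assuming comma and @ are the only delimiters of interest; the variable name y is made up for illustration.

```sas
data _null_ ;
   x = "abc,def@" ;
   /* translate(source, to, from): turn each comma and @ into a blank */
   y = translate(x, "  ", ",@") ;
   /* y has the same length as x, so any position found in y also applies to x */
   abc = indexw(y, "abc") ;
   def = indexw(y, "def") ;
   put _all_ ;
run ;
```

Because translate() swaps characters one for one, nothing shifts position. The cost is an extra function call per search.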
IanWhitlock@westat.com
-----Original Message-----
From: Peter Crawford [mailto:peter.crawford@DB.COM]
Sent: Tuesday, August 13, 2002 10:35 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: indexW() behaviour
Hi there,
I want your opinion !
I just discovered a feature of indexW()
Unlike indexC() and index() with trim(), indexW()
never "finds" blanks, as in the third value for NAME below.
This is behaviour I would want, but hadn't expected !
Now I have to wonder, .....
A) should I _expect_ indexW() to "never find blank" ????
or
B) should I prepare code to test blank explicitly ????
in case a future release _upgrades_ the function
behaviour ??????????????????
(as function quote() was "upgraded" in v7:
it no longer does a trim() before quoting ! )
I tend to favor B) above, because it is obvious to a
programmer who is less familiar with the app./code
BUT if you would not expect indexW() to "find a blank"
then I can adopt policy A).....
..... which might have a performance benefit over a few
billion iterations!!
data _null_;
   names = 'AB  C D' ;
   do name = 'PQD', 'C', ' ' ;
      foundw = indexW( names, name );
      foundC = indexC( names, name );
      found = index( names, name );
      foundt = index( names, trim(name) );
      put name= @11 foundW= foundC= found= foundt= ;
   end;
run;
name=PQD foundw=0 foundC=7 found=0 foundt=0
name=C foundw=5 foundC=3 found=0 foundt=5
name= foundw=0 foundC=3 found=0 foundt=3
NOTE: DATA statement used:
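For comparison, policy B might look like the sketch below: the blank is tested explicitly, so the result does not depend on how indexW() treats a blank search word in any release. Returning 0 for a blank is my choice for illustration, and the two blanks after AB in NAMES are inferred from the positions reported above.

```sas
data _null_;
   names = 'AB  C D' ;
   do name = 'PQD', 'C', ' ' ;
      /* policy B: never pass a blank to indexW(); decide the result here */
      if name = ' ' then foundw = 0 ;
      else foundw = indexW( names, name );
      put name= @11 foundw= ;
   end;
run;
```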
--
This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.