LISTSERV at the University of Georgia
Date:         Tue, 13 Aug 2002 16:26:53 +0100
Reply-To:     Peter Crawford <peter.crawford@DB.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Crawford <peter.crawford@DB.COM>
Subject:      Re: indexW() behaviour
Comments: To: Ian Whitlock <WHITLOI1@WESTAT.com>
Content-type: text/plain; charset=iso-8859-1

Ian, thank you for the input. When you say "I would much rather have a default of word boundaries", I'm not quite sure what you mean, except that your example appears to use a comma as the delimiter, whereas the only word boundary indexW() expects is a blank. I agree that this is worthy of a third parameter, like the delimiter list of scan() ... but see later.

"DSD-type" handling by scan() of delimiters quoted in the data becomes available in v9.0, but as a new function:

> SCANQ function, version 9 and later                Oct 23, 2001
>
> Returns the nth "word" from a character expression, ignoring
> delimiters inside quotation marks.

See ftp://ftp.sas.com/pub/neural/functions/scanq.txt
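As a rough illustration of the SCANQ description quoted above, the sketch below assumes a SAS 9 (or later) session in which SCANQ takes its arguments in the order (string, n, delimiters). The variable names and the sample record are invented for the example, and no output is shown because it has not been run here.

```sas
data _null_;
   length second $ 20;
   /* hypothetical sample record: the second field contains a quoted comma */
   line = 'one,"two,three",four';
   /* SCANQ is expected to ignore the comma inside the quotation marks,
      so the quoted field should come back as a single word */
   second = scanq(line, 2, ',');
   put second=;
run;
```

Plain scan() with the same arguments would split on every comma, which is the difference the "DSD-type" remark refers to.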

The parent page of this unofficial description/documentation, ftp://ftp.sas.com/pub/neural/functions.html, states that INDEXW has an optional third argument to specify delimiters.

So, it seems we should have this improvement to indexW() "real soon now"
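A minimal sketch of how that third argument might be used on Ian's test string, assuming a SAS 9 or later session where INDEXW accepts a delimiter list as described in the unofficial documentation. The delimiter strings chosen here are illustrative assumptions, and the exact word-boundary rules should be checked against the official documentation for the release in use.

```sas
data _null_;
   x = "abc,def@";
   /* third argument: characters to treat as word delimiters (assumed SAS 9+) */
   abc = indexw(x, "abc", ",");    /* "abc" sits between start of string and a comma */
   def = indexw(x, "def", ",@");   /* "def" sits between a comma and an at-sign */
   put abc= def=;
run;
```

Without the third argument, only a blank counts as a word boundary, which is why every search in Ian's log returns 0.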

Kind Regards
Peter Crawford

Date: 13/08/2002 16:10
To: Peter Crawford/Zentrale/DeuBaExt@Zentrale SAS-L@LISTSERV.UGA.EDU
Subject: RE: indexW() behaviour
Message text:

Peter,

I am not surprised. A blank is not a word, in my book, so I would not expect it to be found. On the other hand, I did find the following log rather disturbing, albeit documented. I would much rather have a default of word boundaries. And while I am thinking about it, SCAN should have another parameter, so that it could behave like DSD on INFILE statements. Of course, the default should be to behave as it does now.

   data _null_ ;
      x = "abc,def@" ;
      abc = indexw(x, "abc") ;
      comma = indexw(x,",") ;
      def = indexw(x,"def") ;
      at = indexw(x,"@") ;
      put _all_ ;
   run ;

   x=abc,def@ abc=0 comma=0 def=0 at=0 _ERROR_=0 _N_=1
   NOTE: DATA statement used:
         real time           0.05 seconds

IanWhitlock@westat.com

-----Original Message-----
From: Peter Crawford [mailto:peter.crawford@DB.COM]
Sent: Tuesday, August 13, 2002 10:35 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: indexW() behaviour

Hi there, I want your opinion !

I just discovered a feature of indexW(). Unlike indexC() and index() with trim(), indexW() never "finds" blanks, as in the third value for NAME below. This is behaviour I would want, but hadn't expected! Now I have to wonder:

A) should I _expect_ indexW() to "never find blank"? or
B) should I prepare code to test for blank explicitly, in case a future release _upgrades_ the function's behaviour (as function quote() was "upgraded" in v7, so that it no longer does a trim() before quoting)?

I tend to favor B) above, because an explicit test is obvious to a programmer who is less familiar with the application or its code.

BUT if you would not expect indexW() to "find a blank", then I can adopt policy A), which might have a performance benefit over a few billion iterations!

   data _null_;
      names = 'AB  C D' ;
      do name = 'PQD', 'C', ' ' ;
         foundw = indexW( names, name );
         foundC = indexC( names, name );
         found  = index( names, name );
         foundt = index( names, trim(name) );
         put name= @11 foundW= foundC= found= foundt= ;
      end;
   run;

   name=PQD  foundw=0 foundC=7 found=0 foundt=0
   name=C    foundw=5 foundC=3 found=0 foundt=5
   name=     foundw=0 foundC=3 found=0 foundt=3
   NOTE: DATA statement used:
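Policy B from the question above could be sketched roughly as follows. This is only an illustration of the explicit blank test, not code from the thread. It reuses the variable names of the test step and assumes that a blank search value should always be reported as "not found" (0).

```sas
data _null_;
   /* two blanks after AB, so that C starts at position 5 as in the log above */
   names = 'AB  C D';
   do name = 'PQD', 'C', ' ';
      /* policy B: handle a blank search value explicitly, so the result
         does not depend on how indexW() treats blanks in any release */
      if name = ' ' then foundw = 0;
      else foundw = indexw(names, name);
      put name= @11 foundw=;
   end;
run;
```

The extra comparison costs something on every iteration, which is the performance trade-off against policy A mentioned in the message.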

--

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.
