LISTSERV at the University of Georgia
Date:   Wed, 8 May 1996 15:45:12 +0100
Reply-To:   John Whittington <johnw@MAG-NET.CO.UK>
Sender:   "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:   John Whittington <johnw@MAG-NET.CO.UK>
Subject:   Re: comma delimited file
Comments:   To: Howard_Schreier@ITA.DOC.GOV

On Wed, 8 May 1996, Howard_Schreier@ITA.DOC.GOV wrote:

>On Wed, 8 May 1996 00:06:21 +0100, John Whittington <johnw@MAG-NET.CO.UK>
>wrote (in part):
>> Of course, one solution would have been for SAS to reserve this 'odd
>> behavior', as Tim regards it, only for the situation of the default
>> (space) delimiter.
>
>But now of course it's best not to change anything which will break
>anybody's existing code.

Agreed, as a principle. But, as I said in my previous message, I seriously doubt that any significant amount of existing code would be broken, since I cannot believe that many people use non-significant multiple delimiters when the delimiter is a 'visible' character. That is, I doubt whether many people deal with data which looks like:

    1#2#3######4##Apple###"Quoted"####6

or even

    1,2,3,,,,,,4,,Apple,,,"Quoted",,,,6

(where multiple delimiters are 'non-significant')

>My complaint is that the DSD option changes two different behaviors of
>INFILE: (1) makes consecutive delimiters significant and (2) enables quoting
>of character values, so that such values can contain instances of the
>delimiter character. I regularly get files which are blank delimited with
>non-significant multiple blanks and which also contain quoted strings. In
>other words, I need to utilize #2 but not #1, and I can't.
>If SAS would provide separate options for these behaviors, I for one would
>accept any combination of defaults, no matter how strange.

Yes, I agree. I get data like that, too. There is, as far as I know, no very simple solution. If the character variables are of fixed length, then one can obviously read them in with a fixed-length informat (which copes with embedded 'delimiters'), and then get rid of any quotes with something like:

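    array x _character_ ;    /* implicit array over all character variables */
    do over x ;
      x = dequote(x) ;       /* strip the surrounding quotes from each value */
    end ;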

... but that approach won't work if the variables are (as they usually are!) of varying length. My usual approach is to pre-process the file to eliminate multiple delimiters between variables, and then to use DSD, but I agree this is a pain - as is the alternative of reading the data into SAS character by character and then rebuilding the variables.
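
For what it's worth, the pre-processing step can be sketched outside SAS. The following is only a minimal illustration in Python, not the procedure actually used here: the function name collapse_delimiters and its parameters are made up for the example, and it assumes that values are quoted with double quotes and that an embedded quote is written as a doubled quote. It collapses runs of the delimiter that lie outside quoted strings, and drops leading and trailing delimiters, so that the result could then be read with DSD.

```python
def collapse_delimiters(line, delimiter=" ", quote='"'):
    """Collapse runs of `delimiter` outside quoted strings to a single one.

    Hypothetical helper for illustration only. Delimiters inside quotes
    are left untouched; leading and trailing delimiters are dropped.
    """
    out = []
    in_quotes = False
    # Start as if a delimiter had just been seen, so leading ones are dropped.
    previous_was_delimiter = True
    for ch in line:
        if ch == quote:
            # A doubled quote toggles twice, so the state stays correct.
            in_quotes = not in_quotes
            out.append(ch)
            previous_was_delimiter = False
        elif ch == delimiter and not in_quotes:
            if not previous_was_delimiter:
                out.append(ch)  # keep only the first delimiter of a run
            previous_was_delimiter = True
        else:
            out.append(ch)
            previous_was_delimiter = False
    # Drop a single trailing delimiter that was emitted outside quotes.
    if out and previous_was_delimiter and out[-1] == delimiter:
        out.pop()
    return "".join(out)


blank_line = collapse_delimiters('1  2   "New  York"    3 ')
hash_line = collapse_delimiters('1#2#3######4##Apple###"Quoted"####6', delimiter="#")
print(blank_line)
print(hash_line)
```

Note that the blanks inside "New  York" survive, which is the whole point: only the delimiters between variables are collapsed.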

To maintain consistency with the existing state of affairs, I presume that they would have to *add* a new option, which simply dealt with the quoted strings, leaving DSD still doing both things. To complete the set, I suppose one ought to have the third option, as well - to treat multiple delimiters as significant, but NOT to handle quoted strings.
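
To make the four combinations concrete, here is a rough sketch in Python (not SAS, and not a description of how INFILE is implemented) of a field splitter in which the two behaviours are independent switches. The function split_fields and its parameters significant and quoting are invented purely for illustration. It is deliberately simplified: doubled quotes are not treated as an escaped quote, and with significant=False a quoted empty string is discarded along with the other empty fields.

```python
def split_fields(line, delimiter=",", significant=False, quoting=False):
    """Split `line` into fields with two independent, hypothetical switches.

    significant: consecutive delimiters produce empty fields.
    quoting:     double-quoted values may contain the delimiter, and the
                 quote marks are removed from the value.
    """
    fields, current, in_quotes = [], [], False
    for ch in line:
        if quoting and ch == '"':
            in_quotes = not in_quotes       # quote mark itself is dropped
        elif ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    if not significant:
        # Repeated delimiters are non-significant: empty fields vanish.
        fields = [field for field in fields if field]
    return fields


sample = 'a,,"b,c",,d'
neither = split_fields(sample)                                    # plain delimited input
both = split_fields(sample, significant=True, quoting=True)       # what DSD does today
quoting_only = split_fields(sample, quoting=True)                 # the option Howard wants
significant_only = split_fields(sample, significant=True)         # the 'third option'
for result in (neither, both, quoting_only, significant_only):
    print(result)
```

The third call is the combination that cannot be obtained at present: quoted values are honoured, but the repeated delimiters are still ignored.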

John

-----------------------------------------------------------
Dr John Whittington,        Voice:      +44 1296 730225
Mediscience Services        Fax:        +44 1296 738893
Twyford Manor, Twyford,     E-mail:     johnw@mag-net.co.uk
Buckingham MK18 4EL, UK     CompuServe: 100517,3677
-----------------------------------------------------------

