On Wed, 8 May 1996, Howard_Schreier@ITA.DOC.GOV wrote:
>On Wed, 8 May 1996 00:06:21 +0100, John Whittington <johnw@MAG-NET.CO.UK>
>wrote (in part):
>> Of course, one solution would have been for SAS to reserve this 'odd
>> behavior', as Tim regards it, only for the situation of the default
>> (space) delimiter.
>
>But now of course it's best not to change anything which will break
>anybody's existing code.

Agreed, as a principle. But, as I said in my previous message, I seriously
doubt that much existing code would be broken, since I cannot believe that
many people rely on non-significant multiple delimiters when the delimiter
is a 'visible' character. In other words, I doubt that many people deal
with data which looks like:

   1#2#3######4##Apple###"Quoted"####6

or even

   1,2,3,,,,,,4,,Apple,,,"Quoted",,,,6

(where the multiple delimiters are 'non-significant').

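To make the present behaviour concrete, here is a minimal sketch of how the
first of those lines reads with DLM='#' and no DSD. The data set and variable
names are invented for illustration. As I understand list input, each run of
'#' counts as a single delimiter, and the quotes stay in the value:

```sas
data hashes ;
   /* No DSD: consecutive '#' characters are treated as one delimiter */
   infile datalines dlm='#' truncover ;
   length fruit $ 10 quoted $ 10 ;
   input a b c d fruit $ quoted $ e ;
   datalines ;
1#2#3######4##Apple###"Quoted"####6
;
run ;

proc print data=hashes ;
run ;
```

Adding DSD to that INFILE statement would instead turn every extra '#' into
a missing value, which is the change in behaviour under discussion.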
>My complaint is that the DSD option changes two different behaviors of
>INFILE: (1) makes consecutive delimiters significant and (2) enables quoting
>of character values, so that such values can contain instances of the
>delimiter character. I regularly get files which are blank delimited with
>non-significant multiple blanks and which also contain quoted strings. In
>other words, I need to utilize #2 but not #1, and I can't.
>If SAS would provide separate options for these behaviors, I for one would
>accept any combination of defaults, no matter how strange.

Yes, I agree. I get data like that, too. There is, as far as I know, no
very simple solution. If the character variables are of fixed length, then
one can obviously read them in with a fixed-length informat (which copes
with embedded 'delimiters'), and then get rid of any quotes with something
like:

   array x _character_ ;
   do over x ; x = dequote(x) ; end ;

... but that approach won't work if the variables are (as they usually are!)
of varying length. My usual approach is to pre-process the file to collapse
multiple delimiters between variables, and then to use DSD, but I agree this
is a pain - as is the alternative, which is to read the data into SAS
character by character and then re-build the variables.

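For anyone on a later SAS release, the 'rebuild the variables' route can be
shortened considerably. What follows is only a sketch, and it assumes a SCAN
function that accepts a fourth 'modifiers' argument (SAS 9.2 onwards, to the
best of my knowledge), so it is not an option in the release discussed here.
The data set name, variable names and sample records are all invented:

```sas
data want ;
   infile datalines truncover ;
   length name $ 20 ;
   input ;   /* null INPUT: loads the record into _INFILE_ only */

   /* Without the 'm' modifier, SCAN treats a run of blanks as one   */
   /* delimiter; the 'q' modifier makes it ignore blanks that fall   */
   /* inside quoted strings.                                         */
   id    = input(scan(_infile_, 1, ' ', 'q'), 12.) ;
   name  = dequote(scan(_infile_, 2, ' ', 'q')) ;   /* strip the quotes */
   score = input(scan(_infile_, 3, ' ', 'q'), 12.) ;
   datalines ;
1     "Granny Smith"    6
2  Apple         4
;
run ;
```

This gives behaviour #2 (quoted strings) without behaviour #1 (significant
consecutive delimiters), at the cost of naming every field by position.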
To maintain consistency with the existing state of affairs, I presume that
they would have to *add* a new option, which simply dealt with the quoted
strings, leaving DSD still doing both things. To complete the set, I
suppose one ought to have the third option, as well - to treat multiple
delimiters as significant, but NOT to handle quoted strings.
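
For reference, a minimal sketch of the two things DSD does at once today.
The names and records are invented; the only assumption about DSD beyond
what is said above is that its default delimiter is the comma:

```sas
data both ;
   infile datalines dsd truncover ;
   length name $ 20 ;
   input id name $ score ;
   datalines ;
1,"Smith, John",,
2,Apple,4
;
run ;
```

On the first record the quotes are removed and the embedded comma is kept in
NAME (behaviour #2), while the consecutive commas leave SCORE missing
(behaviour #1). Separate options would let one ask for either effect alone.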
John
-----------------------------------------------------------
Dr John Whittington, Voice: +44 1296 730225
Mediscience Services Fax: +44 1296 738893
Twyford Manor, Twyford, E-mail: johnw@mag-net.co.uk
Buckingham MK18 4EL, UK CompuServe: 100517,3677
-----------------------------------------------------------