Date: Sat, 12 Jul 2003 11:39:41 +0100
Reply-To: Crawford <PeterDOTCrawfordATblueyonder.co.uk@Peter.BITNET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Crawford <PeterDOTCrawfordATblueyonder.co.uk@Peter.BITNET>
Organization: blueyonder (post doesn't reflect views of blueyonder)
Subject: Re: a combination of two characters as one delimiter
Adam,
Harry.DroogendykatCIBC.COM offers, below, the
most practical solution (replace that composite 2-byte
delimiter with a tab), so I can add little to deal immediately
with that data structure. May I suggest you speak to
your data provider. There is a common standard for
handling delimited data. It's such an old standard I
haven't a reference. This standard solves the difficulty
of having a delimiter embedded in data by quoting
that data cell. The standard also has rules for quote
marks in cells to ensure no confusion is caused.
Quote marks inside a cell get doubled. Here is an example to help
explain:
2 Raw data cells (for clarity here, each on its own line):
"don't tell me that", he exclaimed!
"99 was a good year for you, I hope.
Delimiter:
,
In CSV format:
"""don't tell me that"", he exclaimed!","""99 was a good year for you, I hope."
The only limitation I see is that it doesn't allow the
quote mark as a delimiter, but that seems minor, even irrelevant.
If you save an Excel sheet in CSV format, it is not only
character cells containing a comma that get quoted.
Any value that contains the delimiter, for example $1,234
(an amount expressed with dollar formatting), will be
saved quoted, because the delimiter is a comma.
If you select / as a delimiter, dates in m/d/y become
quoted.
The standard is fully respected and supported by the
base SAS infile option DSD, along with the DLM= option,
which allows 1 or more alternative delimiters to be defined.
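
As a minimal sketch of that support, the step below reads the CSV line
from the example above with DSD and a comma delimiter. The dataset and
variable names (cells, cell1, cell2), the lengths and the TRUNCOVER
option are my own assumptions for illustration. The PUT statement writes
both values to the log so they can be compared with the raw data cells.

```sas
data cells;
  /* DSD: quoted values may hold the delimiter; doubled quotes are data */
  infile cards dsd dlm=',' truncover;
  length cell1 cell2 $40;   /* long enough for either example cell */
  input cell1 cell2;
  put cell1= / cell2=;      /* write both cells to the log */
cards;
"""don't tell me that"", he exclaimed!","""99 was a good year for you, I hope."
;
run;
```

The data line is kept flush left and under 80 characters so it fits a
traditional card image.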
If that date with quotes is input in a datastep with
infile options DSD and dlm='/' then even before the
informat gets to work on the string, the quoting for
delimiters is removed.
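
A small sketch of the date case, assuming a made-up record with three
fields separated by / (the names id, when and amount and the data values
are hypothetical). The colon modifier lets the MMDDYY10. informat work
on the field after DSD has taken the quotes off, so the slashes inside
the date are not treated as delimiters.

```sas
data dates;
  /* '/' is the delimiter; DSD protects the slashes inside the quotes */
  infile cards dsd dlm='/' truncover;
  length id $8;
  input id when :mmddyy10. amount;
  format when date9.;
  put id= when= amount=;    /* write the parsed values to the log */
cards;
A1/"7/12/2003"/100
;
run;
```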
So, I hope you might be able to explain to your
data provider that embedded delimiters are more
easily managed within the old CSV standards that,
it appears, remain to be learned.
Unless someone can explain why that old standard
is no longer valid or optimal ?????????????????? !!!
Regards
Peter Crawford
available for SAS consultancy contracts
"Droogendyk, Harry" <Harry.Droogendyk@CIBC.COM> wrote in message
news:F0161D3F7AC5D411A5BE009027E774D60B917EFB@gemmrd-scc013eu.gem.cibc.com...
> Adam:
>
> SAS will allow you to play with the input buffer before you actually
> populate the variables in the INPUT statement. Using TRANWRD, change the
> two character delimiter to a character that will not occur in the data,
> e.g. tab character or other non printable character. Specify that
> character as your delimiter:
>
> options nocenter;
> data a ;
> infile cards dlm='09'x dsd;
> array final(3) $8;
> input @;
> _infile_ = tranwrd(_infile_,'[~','09'x);
> input final(*);
> cards;
> value1[~value2[~val~ue3
> run;
>
> proc print;
> run;
>
>
>
> -----Original Message-----
> From: Adam [mailto:yadong_cui@YAHOO.COM]
> Sent: July 11, 2003 3:20 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: a combination of two characters as one delimiter
>
> How to read delimited txt files into SAS; the txt files used a
> combination of two rare characters as delimiters such as '[~'.
>
> infile txtfile dlm='[~'; will treat either [ or ~ as a delimiter,
> not the combination.
>
> The reason for this combination is that either of them may be shown in
> some fields, so if I pick up either one as a delimiter, the txt file
> may not be read into SAS correctly.
>
> Thanks.