|
Hi,
Say I have an Excel file c:\temp\test.xls containing:
TEST (Row 1, the variable name)
ABC
ABC<Alt-Enter>DEF
ABC<Alt-Enter>DEF<Alt-Enter>GHI
So, in Excel, it looks like (pretend the horrid ASCII "art" are gridlines)
TEST
----
ABC
----
ABC
DEF
----
ABC
DEF
GHI
----
If I import the XLS file like so:
proc import
file="c:\temp\test.xls"
out=test1
dbms=excel
replace;
getnames=yes;
run;
proc print;
format test $hex.;
run;
I get three observations. The data still has the embedded Alt-Enter (hex
code 0A). Which is what I want.
However, if I save the Excel worksheet as CSV, my CSV file looks like:
TEST
ABC
"ABC
DEF"
"ABC
DEF
GHI"
And if I import the CSV like so:
proc import
file="c:\temp\test.csv"
out=test
dbms=csv
replace;
getnames=yes;
guessingrows=32767;
run;
proc print;
run;
I get 6 observations, which mirror the CSV file above.
Is there a way to import the CSV file such that it mimics the import of the
XLS file? I don't care if PROC IMPORT can't do it; I'm happy to write a
"manual" data step.
Perhaps with some creative use of recfm=N so that "0A"x becomes just another
character, instead of an end-of-record indicator?
I guess the logic would be:
* if this data is not quoted, read until end-of-record
* if this data is quoted, read until terminating quotation mark, ignoring
any end-of-record characters.
Free beer if it also performs well ;-)
Cheers,
Scott
|