Date: Mon, 16 May 2011 14:01:39 -0400
Reply-To: Dave Brewer <david.brewer@UC.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dave Brewer <david.brewer@UC.EDU>
Subject: Re: PROC IMPORT - Tab Delimited Problem
Hi Art,
There is a hot fix for this at:
http://ftp.sas.com/techsup/download/hotfix/HF2/B25.html#40220
Thanks.
Dave
On Mon, 16 May 2011 12:23:11 -0400, Arthur Tabachneck <art297@ROGERS.COM>
wrote:
>Dave,
>
>About a month and a half ago you asked about the behavior of proc import
>when importing quoted numbers in a tab-delimited file.
>
>To refresh your (any mine or anyone's memory), if one had a delimited file
>(say c:\testsp.txt) that had, say two quoted variables (e.g., X and Y),
then
>given a file that looked like:
>
>"X" "Y"
>"1000" "1"
>"1001" "2"
>"1002" "3"
>
>using the following code:
>PROC IMPORT OUT= WORK.TESTSP
> DATAFILE= "C:\testsp.txt"
> DBMS=DLM REPLACE;
> GETNAMES=YES;
> DATAROW=2;
>RUN;
>
>SAS would return the following SAS dataset:
>
>X Y
>01OCT00:00:00:00 1
>01OCT01:00:00:00 2
>01OCT02:00:00:00 3
>
>You had actually asked about a tab delimited file, but the same results
>appear to occur with any type of delimited file.
>
>I haven't heard from any birdies, as yet, but did come across something
that
>might explain the behavior. If the field(s) is(are) quoted, then proc
>import appears to apply the anydtdteW. informat to test whether the column
>contains a date. If proc import finds all records that pass that test,
then
>the column is treated as a date. Interestingly, if it finds all fields
meet
>the test, except one which has a missing value, then it treats the field
as
>the number you really wanted. Conversely, if it finds all fields meet the
>test, except one which doesn't but isn't missing, then it treats the field
>as a character field that contains the numbers.
>
>Some of the quoted numbers that anydtdteW. will consider as a date can be
>seen by running:
>
>data test;
> format date_to_test date9.;
> do i= 1 to 20000;
> date_to_test = input(quote(strip(i)), ?? anydtdte21.);
> output;
> end;
>run;
>
>Art
>-------
>On Fri, 1 Apr 2011 12:03:44 -0400, Dave Brewer <david.brewer@UC.EDU>
wrote:
>
>>Art,
>>
>>Perhaps the "Code Doctors" could shed some light on this :=)
>>
>>Thanks for your help.
>>Dave
>>
>>On Fri, 1 Apr 2011 11:02:17 -0400, Arthur Tabachneck <art297@ROGERS.COM>
>>wrote:
>>
>>>Dave,
>>>
>>>I've never confronted this particular situation before and find the
>>>particular behavior perplexing.
>>>
>>>A fascinatingly easy way around it, but which makes absolutely no sense
>>>(other than it works) is to make a simple modification to your first
data
>>>record.
>>>
>>>If it contains "301001" rather than "1001", all of the fields read in as
>>>numbers.
>>>
>>>Then, you would only have to modify the first record to remove the
>>>extraneous "30".
>>>
>>>But, honestly, I have absolutely no idea why it behaves that way.
>>>
>>>Art
>>>-------
>>>On Fri, 1 Apr 2011 10:29:06 -0400, Dave Brewer <david.brewer@UC.EDU>
>>wrote:
>>>
>>>>Art,
>>>>
>>>>This is exactly what I did...but why does SAS read that column as
>>anydate?
>>>>
>>>>Dave
>>>>
>>>>On Fri, 1 Apr 2011 09:09:58 -0400, Arthur Tabachneck
<art297@ROGERS.COM>
>>>>wrote:
>>>>
>>>>>Joe,
>>>>>
>>>>>Another option would be to first rewrite the file simply getting rid
of
>>>>the
>>>>>first two quotes on each line. If the field isn't enclosed in quotes
>>proc
>>>>>import will read it correctly as a number.
>>>>>
>>>>>Art
>>>>>-------
>>>>>On Fri, 1 Apr 2011 07:58:48 -0500, Joe Matise <snoopy369@GMAIL.COM>
>>wrote:
>>>>>
>>>>>>Does DBDSOPTS->DBSASTYPE work for importing delimited files (like it
>>does
>>>>>>for excel)? That would allow you to individually assign columns.
I've
>>>>not
>>>>>>tried it (and am not sure it would work) but if it does this would be
>>>>your
>>>>>>meal ticket.
>>>>>>
>>>>>>Otherwise another option in addition to _null_'s suggestion below is
to
>>>>>read
>>>>>>in one (or a few) line(s) of the file with proc import, then use that
>>to
>>>>>>generate code to read in the rest of the file.
>>>>>>
>>>>>>-Joe
>>>>>>
>>>>>>On Fri, Apr 1, 2011 at 7:45 AM, Data _null_; <iebupdte@gmail.com>
>>wrote:
>>>>>>
>>>>>>> On Fri, Apr 1, 2011 at 7:35 AM, Dave Brewer <david.brewer@uc.edu>
>>>>wrote:
>>>>>>> > I realize I could manipulate the erroneous value, but I am still
>>>>>puzzled
>>>>>>> > as to why SAS is doing this.
>>>>>>> There is a parameter call QUESSINGROWS. Programmed guessing don't
>>>>always
>>>>>>> work.
>>>>>>>
>>>>>>> >
>>>>>>> > I am confused on what you are trying to say in your third
solution
>>>>with
>>>>>>> > the input @ and PUT. Would you elaborate?
>>>>>>>
>>>>>>> Read the file and make a new one (TEMP) with a character added to
the
>>>>>>> Field.
>>>>>>>
>>>>>>> filename FT15F001 temp;
>>>>>>> filename FT16F001 temp;
>>>>>>> data _null_;
>>>>>>> infile FT15F001;
>>>>>>> file FT16F001;
>>>>>>> input @;
>>>>>>> if _n_ gt 1 then put 'S' _infile_;
>>>>>>> else put _infile_;
>>>>>>> parmcards;
>>>>>>> StudyNumber
>>>>>>> 1001
>>>>>>> 0110
>>>>>>> 1011
>>>>>>> ;;;;
>>>>>>> run;
>>>>>>> proc import datafile=FT16F001 dbms=dlm out=work.TXT_ACOGsA replace;
>>>>>>> delimiter='09'x; getnames=yes; guessingrows=32767;
>>>>>>> run;
>>>>>>> proc print;
>>>>>>> run;
>>>>>>>
>>>>>>> > Thanks again for your concern.
>>>>>>> > Dave
>>>>>>> >
>>>>>>> > On Fri, 1 Apr 2011 07:22:30 -0500, Data _null_;
>><iebupdte@GMAIL.COM>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> >>I wonder if modifying
>>>>>>> >>DATESTYLE= MDY | MYD | YMD | YDM | DMY | DYM | LOCALE
>>>>>>> >>would have a desirable effect on PROC IMPORT.
>>>>>>> >>
>>>>>>> >>You could also "extract" the original value 1001 from the SAS
date
>>>>>>> >>time variable.
>>>>>>> >>SN = put(StudyNumber,?????);
>>>>>>> >>rename SN=StudyNumber;
>>>>>>> >>drop StudyNumber;
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>I might consider modifying the input files with a data step. All
>>>>>>> >>files could be modified with a single data step.
>>>>>>> >>
>>>>>>> >>input@;
>>>>>>> >>put 'S' _infile_;
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>On Fri, Apr 1, 2011 at 7:04 AM, Dave Brewer <david.brewer@uc.edu>
>>>>>wrote:
>>>>>>> >>> No.
>>>>>>> >>>
>>>>>>> >>> The only variable in common is the StudyNumber.
>>>>>>> >>>
>>>>>>> >>> Dave
>>>>>>> >>>
>>>>>>> >>> On Fri, 1 Apr 2011 07:02:24 -0500, Data _null_;
>>>><iebupdte@GMAIL.COM>
>>>>>>> > wrote:
>>>>>>> >>>
>>>>>>> >>>>On Fri, Apr 1, 2011 at 6:48 AM, Dave <david.brewer@uc.edu>
wrote:
>>>>>>> >>>>> I have several tab-delimited files each containing about 100
>>>>>>> > variables;
>>>>>>> >>>>
>>>>>>> >>>>Could all the files be read with the same input statement?
>>>>>>> >>>
>>>>>>> >
>>>>>>>
|