Date: Tue, 18 May 2004 06:35:09 -0400
Reply-To: ben.powell@CLA.CO.UK
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: ben.powell@CLA.CO.UK
Subject: Re: Import CSV file, variable field numbers
There are two weaknesses in the code I'm currently looking at (one
critical). First it can't handle speech mark delimeters for text, although
this is no biggy. The more major problem is that if there are more than two
rows of data for each ISBN then it overwrites the second row with the last
so that only two are captured. I realise there was a lack of understanding
on my part vis a vis the array handling but having tested the original code
this also could not handle more than two rows of supplementary data per
ISBN. Any ideas?
Ben.
On Tue, 18 May 2004 05:24:11 -0400, ben.powell@CLA.CO.UK wrote:
>I ended up adapting the code in this mail from Randy Rak:
>
><mail>
>B., the code below worked on data that I generated to match the concept in
>your post (Details may vary.). R.
>
>--- begin code ---
>data data; length rec $ 200; input; rec=_infile_; if rec=' ' then delete;
>cards;
>ISBN: 123-4567-89012
>
>abc,123,!@#
>
>ISBN: 012-345-67890
>
>xyz,890,*()
>
>more,h.i.j
>
>ISBN: 098-7654-32109
>; run;
>
>title1 'data'; proc print noobs; run;
>
>data pars(drop=rec i j); set data end=eod; retain isbn det1-det6 j; length
>isbn $ 14 det1-det6 $ 5; array det(6) det1-det6; if rec=:'ISBN' then do; if
>_n_>1 then output; isbn=scan(rec,2,' '); do j=1 to 6; det(j)=' '; end; j=0;
>end; else do; do i=1 to 3; det(i+j)=scan(rec,i,','); end; j=3; end; if eod
>then output; run;
>
>title1 'pars'; proc print; run;
>--- end code ---
></mail>
>
>My slight adaptation was as follows:
>
><adaptation>
>data data;
>infile 'P:\SAS\PROJECTS\sasarevisedbooks1.csv';
>length rec $200.;
>input;
>rec=_infile_;
>if rec='' then delete;
>run;
>
>data pars(drop=rec i j);
>set data end=eod;
>retain isbn det1-det30 j;
>length isbn $ 20 det1-det30 $ 50;
>array det(30) det1-det30;
>if rec=:'ISBN' then do;
> if _n_>1 then output;
> isbn=compress((scan(rec,3,' ')),',');
> do j=1 to 30; det(j)=' ';
> end;
>j=0;
>end;
>else do;
> do i=1 to 3;
> det(i+j)=scan(rec,i,',');
> end; j=3;
>end;
>if eod then output;
>run;
></adaptation>
>
>Works a treat!
>
>Ben.
>
>
>On Mon, 17 May 2004 10:59:30 -0400, Chang Y. Chung
><chang_y_chung@HOTMAIL.COM> wrote:
>
>>On Mon, 17 May 2004 06:37:04 -0400, ben.powell@CLA.CO.UK wrote:
>>
>>>Dear SAS-L
>>>
>>>I need to import a comma delimited file where there could be anything
>>>between 1 and 5 additional rows of data after the original item. The data
>>>is title information - a catalogue extract - and each new record
>>>begins "ISBN: ". Following this there is a blank line then DETAIL1-3,
then
>>>another blank line and if there are more details then DETAIL4-6, or
>>>otherwise there will be a new record again beginning "ISBN: ".
>>>
>>>I've used this sort of syntax in the past but how can I get it to check
>>>whether to create a new record or additional detail data?
>>>
>>><sascode>
>>>data test;
>>>retain ISBN det1 det2 det3 det4 det5 det6 email web;
>>>infile "&path\Docs\CSV\_list.csv" dlm = ',' dsd missover firstobs = 1;
>>>informat det1 8. det2 $60. det3 $30. email web $50. det4 $100. det5 det6
>>>$20.;
>>>input det1 det2 $ det3 $ det4 $ det5 $
>>>/ j1 $ j2 $ j3 $ det6 $ email $
>>>#3 j4 $ j5 $ j6 $ j7 $ web $;
>>>drop j1 j2 j3 j4 j5 j6 j7;
>>>run;
>>></sascode>
>>
>>Hi, Ben,
>>
>>This is tricky. But you basically want to:
>>
>>(1) read one line at a time (first read the line to decide which type of
>>line it is (first line with new ISBN, det1-3, det4-6, or blank) and then
>>re-read the line with appropriate variables;
>>(2) retain everything
>>(3) output only the retained information when i: a new ISBN line is
>>encountered or ii: no more data lines
>>(4) once output then clear all the retained variables and start over at
>>(1).
>>
>>It may be cleaner to program if you go over the data twice (with two data
>>steps), the first time creating a flag variable indicating the last row of
>>the given ISBN(and also eliminating the blank lines). The second time, you
>>just output and clean up retained variable only when the flag is set. HTH.
>>
>>Cheers,
>>Chang
|