LISTSERV at the University of Georgia
Date:   Mon, 9 May 2005 17:14:46 -0400
Reply-To:   Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Subject:   Re: Processing multiple CSV files from a directory
Comments:   To: Sa Polo <solouga2@REDIFFMAIL.COM>

On Mon, 9 May 2005 16:19:01 -0000, sa polo <solouga2@REDIFFMAIL.COM> wrote:

>Hi All,
>I am attempting to write a program to read CSV files from several sub-directories and process them by adding a date field and part of the file name to the new file, which is again in CSV format.
>
>Directory: c:\
>Sub-Dirs : 20010101, 20010102 .................
>CSV files in each directory: abc_a.csv, abc_ab.csv, abc_se.csv, abc_po.csv, abc_**** .........
>
>Extract the data from each file and merge in the date (obtained from the sub-directory) in date9. format and whatever appears after the _ (underscore) in the file name.
>
>Example: file abc_a in directory 20010101:
>1,se,oo,poll
>
>Output:
>1,se,oo,poll,01JAN2001,a
>
>The program should read all the sub-directories in the directory (in this case c:\), process all files, and write them to a separate location on, say, another drive or directory, in the same CSV format.
>
>All assistance is much appreciated as usual.
>
>Sa

I think we've seen this question before, so reading the archives may be a better option than the following ....

This from memory gets close (*** beware untested code *****)

%let pathRoot = c:\ ;
%let pathMask = &pathroot.200 ;   * filter paths with this prefix ;
%let fileMask = abc_ ;            * only filenames beginning like this ;

* assuming all files have same column lengths/types structure ;
%macro the_data( len_defn ) ;
   length &len_defn ;
   input %scan( &len_defn, 1 )     /* first column name */
      -- %scan( &len_defn, -2 )    /* name before the last length */
   /* no semicolon here: the one after the macro call ends the INPUT statement */
%mend;

* get filenames & paths from command pipe ;
filename collect pipe "dir /b/s ""&pathRoot"" " lrecl= 1000 ;

* get data ;
filename dum1 '.' ;   * dummy fileref for filevar infile ;
data results( drop= filen ) ;
   length pathName filen $1000 file $60 ;
   retain filen ;
   infile collect ;
   input ;
   pathName = _infile_ ;
   * load the path and filename from the pipe this way to avoid input problems with multiple embedded blanks ;

   if upcase(pathname) =: "%upcase(&pathMask)" ;
   filen = scan( pathName, -1, '\' );
   if filen =: "&fileMask" ;

   * position 4 assumes a 3-character root such as c:\ ;
   dir_date = input( substr( pathname, 4, 8 ), ?? yymmdd8. ) ;
   if dir_date = . then delete ;   * must be valid date! ;
                                   * avoid paths that do not start with a valid date ;
   format dir_date date9. ;

   * keep what follows the underscore, without the .csv extension ;
   file = scan( substr( filen, 1 + index( filen, '_' )), 1, '.' ) ;

   * now point into file to be read ;
   infile dum1 filevar= pathname lrecl= 30000 DSD end= eofD ;
   input ;   * drops the normal column-names heading-line ;
             * not needed if there is no heading line ;
   do while( not eofD );
      %the_data( id 8 cat1 $2 cat2 $2 poll $8 ) ;
      output ;
   end;
   * reset the end-of-data flag ready to read the next file ;
   eofD= 0;
run;
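
The original request also asked for the combined rows to be written back out as CSV. A minimal sketch of that last step, equally untested, assuming the RESULTS data set built above and a hypothetical output file d:\out\all_files.csv (the path and the single-output-file layout are my assumptions, not part of the question):

```sas
* Sketch only (untested): write RESULTS back out in CSV form ;
* d:\out\all_files.csv is a hypothetical output location ;
data _null_ ;
   set results ;
   file "d:\out\all_files.csv" dsd lrecl= 30000 ;
   * DSD on the FILE statement separates the values with commas ;
   * :date9. writes the directory date as e.g. 01JAN2001 ;
   put id cat1 cat2 poll dir_date :date9. file ;
run ;
```

This puts every input file into one output file; writing one output file per input file would need the path kept on RESULTS and a FILEVAR= option on the FILE statement.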

* For unix, the command to pipe would be a little different and the parsing would use / not \. Little else would need to change ;
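
As a rough illustration of the unix variant (again untested), assuming a hypothetical root directory /data/ held in a hypothetical macro variable unixRoot, and that the standard find command is available:

```sas
* Sketch only (untested): unix version of the file listing ;
* /data/ and the macro variable unixRoot are hypothetical ;
%let unixRoot = /data/ ;
filename collect pipe "find &unixRoot -type f -name 'abc_*.csv'" lrecl= 1000 ;

/* Inside the data step the path would then be split on / and the
   date would start right after the root, for example:
      filen    = scan( pathName, -1, '/' ) ;
      dir_date = input( substr( pathName, %length(&unixRoot) + 1, 8 ), ?? yymmdd8. ) ;
*/
```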

Does that seem to cover the customisation needed ?

Peter Crawford

