LISTSERV at the University of Georgia
Date:   Thu, 5 Jul 2001 16:31:30 -0400
Reply-To:   Howard Schreier <Howard_Schreier@ITA.DOC.GOV>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Howard Schreier <Howard_Schreier@ITA.DOC.GOV>
Subject:   Re: More about one record per file

I think it's a DOS problem, and I believe 16K is an improvement over what the limit used to be.

How are the batches of files delivered to you? If they arrive in a ZIP archive, you might look at the possibility of filtering them into multiple directories as the files are unzipped, perhaps via a script.
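For what it's worth, here is a minimal sketch of such a script in Python (my choice of language; any scripting tool would do). It assumes the archive members are *.dat files with unique base names. The function name unzip_into_buckets, the per_dir parameter, and the sub1, sub2, ... folder names are all hypothetical. It starts a new subfolder every per_dir files, so no single folder gets anywhere near the 16K mark.

```python
import fnmatch
import zipfile
from pathlib import Path, PurePosixPath


def unzip_into_buckets(archive_path, dest_root, per_dir=5000, pattern="*.dat"):
    """Extract matching members, starting a new subfolder every per_dir files.

    Returns a dict mapping each subfolder name to the number of files written.
    Assumption: base names are unique, since any folder structure inside the
    archive is flattened.
    """
    dest_root = Path(dest_root)
    counts = {}
    written = 0
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # ZIP member names always use forward slashes.
            name = PurePosixPath(info.filename).name
            if not fnmatch.fnmatch(name.lower(), pattern):
                continue
            # Hypothetical naming scheme: sub1, sub2, ...
            sub = dest_root / f"sub{written // per_dir + 1}"
            sub.mkdir(parents=True, exist_ok=True)
            (sub / name).write_bytes(zf.read(info))
            written += 1
            counts[sub.name] = counts.get(sub.name, 0) + 1
    return counts
```

Calling unzip_into_buckets(r"c:\lib\batch.zip", r"c:\lib") would then leave the files in c:\lib\sub1, c:\lib\sub2, and so on; the archive name batch.zip is only a placeholder.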

In SAS, you should be able to code something like

FILENAME stuff ("c:\lib\sub1\*.dat", "c:\lib\sub2\*.dat", ... );

Then you could still process all in one step.

On Thu, 5 Jul 2001 04:37:32 GMT, Roger Lustig <julierog@IX.NETCOM.COM> wrote:

>There's nothing like testing...
>
>The other day I asked about a FILENAME statement that would
>run all the files in a directory/folder through a DATA step.
>Ian Whitlock and Charles Patridge helped out. I settled on
>Ian's suggestion:
>    FILENAME stuff "c:\lib\*.dat";
>
>I neglected to mention that I have many, many files. 25,000 of
>them in this batch; tomorrow I get 150,000 to be processed together.
>
>Evidently, SAS plus the DOS/Win98 OS (I was on NT before; but I
>took the problem home) must step through the entire directory
>from the top in order to find the next file. The result: ever-
>increasing execution time. The first 1000 records/files took 6 seconds
>to process; the next, 11; and so on, more or less, each additional
>1000 requiring another 5 seconds or so.
>
>First attempt at a solution: modified wildcards.
>
>    FILENAME stuff "c:\lib\1234*.dat";
>
>This would run about 15% of the total number of records; with
>6 or 7 of these, plus some appending, I'd get the same results.
>
>No dice. In fact, the first few thousand records ran as slowly
>as the last few had when I used the global wildcard.
>
>Second attempt: split the folder up into multiple folders. This
>worked nicely--execution time dropped from 28 min. for 25,000
>records to 7. Of course, moving all the files from place to place
>wasn't fast either...
>
>But it did produce one observation--not that that's much of a
>sample. I found that the first few moves of a few thousand records
>took a long time; but once the number of records in the big folder
>dropped to 16K or so, things sped up dramatically and immediately.
>
>Was this just a coincidence? Or is there something about DOS/Win
>file structure (FAT32) that slows down when there are more than
>16K files in a folder?

