I think it's a DOS problem, and I believe 16K is an improvement over what it
used to be.
How are the batches of files delivered to you? If they arrive in a ZIP
archive, you might look at the possibility of filtering into multiple
directories as the files are unzipped, perhaps via a script.
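A minimal sketch of such a script in Python, assuming the batch arrives as a single ZIP archive. The function name unzip_into_buckets, the per_dir limit, and the sub1, sub2, ... folder names are made up for illustration; it also assumes file names are unique once any paths stored in the archive are dropped.

```python
import os
import shutil
import zipfile


def unzip_into_buckets(zip_path, dest_root, per_dir=10000):
    """Extract every file in zip_path into dest_root/sub1, dest_root/sub2, ...

    Each subfolder receives at most per_dir files. Returns the list of
    subfolders that were used, in order. (Hypothetical helper.)
    """
    buckets = []
    count = 0
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries stored in the archive
            bucket = os.path.join(dest_root, "sub%d" % (count // per_dir + 1))
            if not buckets or buckets[-1] != bucket:
                os.makedirs(bucket, exist_ok=True)
                buckets.append(bucket)
            # Assumption: flatten archive paths and keep only the file name.
            target = os.path.join(bucket, os.path.basename(info.filename))
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            count += 1
    return buckets
```

Keeping per_dir comfortably under the 16K figure from the original post should keep each folder below the point where the slowdown was seen, though that threshold comes from a single observation.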
In SAS, you should be able to code something like
FILENAME stuff ("c:\lib\sub1\*.dat", "c:\lib\sub2\*.dat", ... );
Then you could still process all in one step.
On Thu, 5 Jul 2001 04:37:32 GMT, Roger Lustig <julierog@IX.NETCOM.COM>
wrote:
>There's nothing like testing...
>
>The other day I asked about a FILENAME statement that would
>run all the files in a directory/folder through a DATA step.
>Ian Whitlock and Charles Patridge helped out. I settled on
>Ian's suggestion:
> FILENAME stuff "c:\lib\*.dat";
>
>I neglected to mention that I have many, many files. 25,000 of
>them in this batch; tomorrow I get 150,000 to be processed together.
>
>Evidently, SAS plus the DOS/Win98 OS (I was on NT before; but I
>took the problem home) must step through the entire directory
>from the top in order to find the next file. The result: ever-
>increasing execution time. The first 1000 records/files took 6 seconds
>to process; the next, 11; and so on, more or less, each additional
>1000 requiring another 5 seconds or so.
>
>First attempt at a solution: modified wildcards.
>
> FILENAME stuff "c:\lib\1234*.dat";
>
>This would run about 15% of the total number of records; with
>6 or 7 of these, plus some appending, I'd get the same results.
>
>No dice. In fact, the first few thousand records ran as slowly
>as the last few had when I used the global wildcard.
>
>Second attempt: split the folder up into multiple folders. This
>worked nicely--execution time dropped from 28 min. for 25,000
>records to 7. Of course, moving all the files from place to place
>wasn't fast either...
>
>But it did produce one observation--not that that's much of a
>sample. I found that the first few moves of a few thousand records
>took a long time; but once the number of records in the big folder
>dropped to 16K or so, things sped up dramatically and immediately.
>
>Was this just a coincidence? Or is there something about DOS/Win
>file structure (FAT32) that slows down when there are more than
>16K files in a folder?