Date: Thu, 27 May 2010 14:13:02 -0500
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Matthew Pettis <matt.pettis@THOMSONREUTERS.COM>
Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?
Content-Type: text/plain; charset="iso-8859-1"
A couple of constraints for me...
1. I know the first 3 cols are always the same and in the same order; the rest of the columns are not, and sometimes a column doesn't appear in certain files. They are all numeric. So I have some information about the nature of the columns, but not all of it.
2. I want column names out of the gate, and usually I want 5 cols out of ~100. If I read a file in straight, I'd have to find the columns I want and remove the others. PROC IMPORT is somewhat helpful here.
3. But PROC IMPORT is not really that fast; I assume it is due to I/O. What I do know is that I can reduce a 45-minute job to under 5 minutes by doing this pre-processing and making SAS do only 1 read from pipe and 1 read from disk. That's big.
SAS is pretty good at text manipulation, but there are lighter and faster pre-processors that can make for a significant speedup in overall processing time.
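One way to wire that up (a sketch only: the logparser query, column names, and macro-free layout here are placeholders, not my actual job) is a FILENAME PIPE feeding a datastep view, so SAS does exactly one read from the pipe:

```sas
/* Sketch: logparser pre-filters ~100 cols down to the 5 of interest.   */
/* The SELECT list and file pattern are placeholders.                   */
filename pre pipe
  'logparser "SELECT col1,col2,col3,colA,colB INTO STDOUT FROM *.csv" -i:CSV -o:CSV';

data want / view=want;            /* a view, so nothing lands on disk until it is read */
  infile pre dsd truncover firstobs=2;  /* firstobs=2 skips the header row */
  input col1 col2 col3 colA colB;       /* all numeric, per constraint 1 */
run;
```

The view is what keeps it to a single pass: downstream steps pull rows straight off the pipe instead of re-reading an intermediate dataset.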
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Tom Abernathy
Sent: Thursday, May 27, 2010 12:10 PM
Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?
SAS is very good at text manipulation; why not just pass it through SAS?
One of my standard tools for looking at an unknown delimited file is to read
it into character variables in SAS and see what is there.
infile 'what-is-this.csv' dsd truncover;
length x1-x50 $200;
input x1-x50;
On Thu, 27 May 2010 09:13:29 -0500, Matthew Pettis
>My 6,000 files aren't *quite* guaranteed to have exactly the same layout,
>but I'm thinking of coercing them into one by reading the file and piping it
>through an appropriate 'logparser' query (think awk restricted to very
>rectangular data for Windows), and then providing a single datastep (in
>fact, I'm going to make it a view so as to reduce storage and I/O in my
>From: Søren Lassen [mailto:s.lassen@POST.TELE.DK]
>Sent: Wednesday, May 26, 2010 11:57 PM
>To: SAS-L@LISTSERV.UGA.EDU; Pettis, Matthew (Legal)
>Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?
>You can route output in Windows to nowhere by using the 'NUL' physical filename:
>PROC PRINTTO log='NUL';run;
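Wrapped around the noisy imports, that looks like the following (a sketch; the macro name is hypothetical, and PROC PRINTTO with no LOG= option restores the default log destination):

```sas
proc printto log='NUL'; run;   /* divert the log to the Windows null device */

%import_all_files()            /* the chatty PROC IMPORT loop (hypothetical macro) */

proc printto; run;             /* restore the log to its default destination */
```

Because nothing is written to disk in between, there is no large .log file to delete afterwards.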
>You can turn off the dump of the datastep by using
>as the 'data' step, as you call it, is a very real datastep, generated
>by PROC IMPORT and then submitted. If your 6000 .csv files all have
>the same format, you can also import one of them, then recall the datastep
>code (the recall button, typically F4, in the program editor) and
>edit it to read all of the files (use the FILEVAR and EOV options for
>the INFILE statement) in one fell swoop in a single datastep. Probably a
>lot faster than doing 6000 PROC IMPORTs.
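A sketch of that single datastep, assuming the 6000 paths have been collected in a dataset CSVLIST with a character variable FNAME (those names, and the x1-x50 layout, are assumptions; this variant uses END= rather than EOV=, and header rows are not handled):

```sas
data all;
  set csvlist;                        /* one row per .csv file; FNAME holds the path */
  infile csvin filevar=fname dsd truncover end=done;
  do while (not done);                /* with FILEVAR=, END= flags the last record
                                         of the currently open file               */
    input x1-x50;
    output;
  end;
run;
```

One datastep, one pass over each file, no per-file PROC IMPORT overhead in the log or the run time.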
>On Wed, 26 May 2010 21:01:50 -0500, Matthew Pettis
>>I'm trying to import ~6000 .csv files with PROC IMPORT, and the .csv
>>files are very wide. The problem is that my SAS LOG fills up very
>>quickly in interactive mode, and in batch mode it takes up a lot of space
>>in the LOG file, which I immediately delete because it's so large.
>>I'd like a way to redirect the SAS LOG to the equivalent of /dev/null
>>for the duration of the imports (which is encapsulated in a macro).
>>Right now, I'm temporarily diverting the LOG to an external .log file
>>with a PROC PRINTTO statement, and then immediately deleting it. I'd
>>like to avoid the I/O in the first place if possible.
>>On a very related note, since these are wide .csv files, the PROC IMPORT
>>statement generates very large LOG entries with the underlying 'data'
>>step that does the import in the LOG. Is there *any* way to turn off
>>the dump of the 'data' step to the LOG?
>>Thanks in advance,