LISTSERV at the University of Georgia
Date:         Thu, 27 May 2010 12:36:49 -0700
Reply-To:     "Choate, Paul@DDS" <Paul.Choate@DDS.CA.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Choate, Paul@DDS" <Paul.Choate@DDS.CA.GOV>
Subject:      Re: Redirect SAS Log (temporarily) to a Windows '/dev/null'
              equivalent?
Comments: To: "matt.pettis@THOMSONREUTERS.COM" <matt.pettis@THOMSONREUTERS.COM>
In-Reply-To:  <89C159F45B13A24682D98BDEF58E451F28559C95@TLRUSMNEAGMBX28.ERF.THOMSON.COM>
Content-Type: text/plain; charset="iso-8859-1"

Matt -

PROC IMPORT reads the whole file, whereas with INFILE and INPUT you can pick just the columns you need. That way you read only the 5/200ths of the data you are interested in (which might be even faster than your 5/45ths goal), and your log shouldn't fill up.

Are your files tab delimited with uniform names at the top?

It's possible to read the first line of a text file, SCAN the positions of the columns you need, and then use that information to INPUT the 5 select columns.

If your data don't have uniform names at the top I don't know how you intend to distinguish them.
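
A minimal sketch of the header-scan idea, under several assumptions: the file is tab delimited, the names are on the first line, every wanted column is numeric, and the file name `wide-file.txt` and the five names in `&want` are hypothetical placeholders. It pulls the fields out of `_INFILE_` with SCAN rather than skipping columns in a list-style INPUT, which is one of several ways to do it.

```sas
/* Hypothetical names: wide-file.txt and the five columns in &want. */
%let want = id date price volume rate;

data picked (keep=v1-v5);
  infile 'wide-file.txt' lrecl=32767 truncover;
  array pos[5] _temporary_;  /* header position of each wanted column */
  array v[5];                /* v1-v5, in the order listed in &want   */

  input;                     /* load the whole record into _INFILE_   */

  if _n_ = 1 then do;        /* header row: locate the wanted names   */
    do k = 1 to 5;
      do i = 1 to countw(_infile_, '09'x, 'm');
        if upcase(strip(scan(_infile_, i, '09'x, 'm'))) = upcase(scan("&want", k, ' '))
          then pos[k] = i;
      end;
    end;
    delete;                  /* the header itself is not a data row   */
  end;

  do k = 1 to 5;             /* data rows: convert only those fields  */
    if pos[k] > 0 then v[k] = input(scan(_infile_, pos[k], '09'x, 'm'), ?? best32.);
    else v[k] = .;           /* column absent from this file          */
  end;
run;
```

The 'm' modifier keeps empty fields in place, so a missing value between two tabs does not shift the later columns. A column that does not appear in a given file simply comes through as missing.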

Paul Choate DDS Data Extraction (916) 654-2160

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Matthew Pettis
Sent: Thursday, May 27, 2010 12:13 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?

Hi Tom,

A couple of constraints for me...

1. I know the first 3 cols are always the same and in the same order, the rest of the columns are not, and sometimes a column doesn't appear in certain files. They are all numeric. So I have some information about the nature of the columns, but not all.

2. I want column names out of the gate, and usually I want 5 cols out of ~100. With reading in a file straight, I'd have to find the column, remove the others. PROC IMPORT is somewhat helpful here.

3. But PROC IMPORT is not really that fast. I assume it is due to I/O. What I do know is that I can reduce a 45min job to under 5min by doing this pre-processing and making SAS do only 1 read from pipe and 1 read from disk. That's big.

SAS is pretty good at text manipulation, but there are lighter and faster pre-processors that can make for a significant speedup in overall processing time.

Thanks, Matt

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Tom Abernathy
Sent: Thursday, May 27, 2010 12:10 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?

SAS is very good at text manipulation, so why not just pass it through SAS? One of my standard tools for looking at an unknown delimited file is to read it into character variables in SAS and see what is there.

data peek;  /* any dataset name will do */
  infile 'what-is-this.csv' dsd truncover;
  length x1-x50 $200;
  input x1-x50;
run;

On Thu, 27 May 2010 09:13:29 -0500, Matthew Pettis <matt.pettis@THOMSONREUTERS.COM> wrote:

>Thanks Soren!
>
>My 6,000 files aren't *quite* guaranteed to have exactly the same layout, but I'm thinking of coercing them into one by reading the file and piping it through an appropriate 'logparser' query (think awk restricted to very rectangular data for Windows), and then providing a single datastep (in fact, I'm going to make it a view so as to reduce storage and I/O in my job).
>
>Thanks again,
>Matt
>
>-----Original Message-----
>From: Søren Lassen [mailto:s.lassen@POST.TELE.DK]
>Sent: Wednesday, May 26, 2010 11:57 PM
>To: SAS-L@LISTSERV.UGA.EDU; Pettis, Matthew (Legal)
>Subject: Re: Redirect SAS Log (temporarily) to a Windows '/dev/null' equivalent?
>
>Matt,
>You can route output in Windows to nowhere by using the 'NUL' physical
>filename, e.g.
>
>PROC PRINTTO log='NUL';run;
>
>You can turn off the dump of the datastep by using
>
>OPTIONS NOSOURCE;
>
>as the 'data' step, as you call it, is a very real datastep, generated
>by PROC IMPORT and then submitted. If your 6000 .csv files all have
>the same format, you can also import one of them, then recall the datastep
>code (the recall button, typically F4, in the program editor) and
>edit it to read all of the files (use the FILEVAR and EOV options for
>the INFILE statement) in one fell swoop in a single datastep. Probably a
>lot faster than doing 6000 PROC IMPORTs.
>
>Regards,
>Søren
>
>On Wed, 26 May 2010 21:01:50 -0500, Matthew Pettis
><matt.pettis@THOMSONREUTERS.COM> wrote:
>
>>Hi,
>>
>>I'm trying to import ~6000 .csv files with PROC IMPORT, and the .csv
>>files are very wide. The problem is that my SAS LOG fills up very
>>quickly in interactive mode, and in batch mode, takes up a lot of space
>>in the LOG file that I immediately delete because it's so large.
>>
>>I'd like a way to redirect the SAS LOG to the equivalent of /dev/null
>>for the duration of the imports (which is encapsulated in a macro).
>>Right now, I'm temporarily diverting the LOG to an external .log file
>>with a PROC PRINTTO statement, and then immediately deleting it. I'd
>>like to avoid the I/O in the first place if possible.
>>
>>On a very related note, since these are wide .csv files, the PROC IMPORT
>>statement generates very large LOG entries with the underlying 'data'
>>step that does the import in the LOG. Is there *any* way to turn off
>>the dump of the 'data' step to the LOG?
>>
>>Thanks in advance,
>>
>>Matt
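
For reference, the two suggestions quoted above (PROC PRINTTO to 'NUL' and OPTIONS NOSOURCE) can be wrapped around the imports and then undone. This is only a sketch: the placeholder comment stands in for the import macro, whose name is not given in the thread.

```sas
/* Silence the log for the duration of the imports (Windows NUL device). */
options nosource;
proc printto log='NUL';
run;

/* ... run the import macro here ... */

/* Restore the default log destination and source echoing. */
proc printto;
run;
options source;
```

A PROC PRINTTO step with no options routes the log back to its default destination, so anything after that point is logged normally again.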

