LISTSERV at the University of Georgia
Date:   Tue, 28 Aug 2007 20:20:57 -0700
Reply-To:   David L Cassell <davidlcassell@MSN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   David L Cassell <davidlcassell@MSN.COM>
Subject:   Re: Processing Multiple data files
In-Reply-To:   <200708261926.l7QAklKT001283@malibu.cc.uga.edu>
Content-Type:   text/plain; format=flowed

randistan69@HOTMAIL.COM wrote back:
>
>On Thu, 23 Aug 2007 23:40:00 -0400, Howard Schreier <hs AT dc-sug DOT org>
><nospam@HOWLES.COM> wrote:
>
> >On Thu, 23 Aug 2007 13:24:03 -0400, SUBSCRIBE SAS-L Anonymous
> ><randistan69@HOTMAIL.COM> wrote:
> >
> >>On Thu, 23 Aug 2007 03:32:44 -0400, SUBSCRIBE SAS-L Anonymous
> >><randistan69@HOTMAIL.COM> wrote:
> >>
> >>>Dear All:
> >>> I have multiple files (about 500) in txt format. They are named file1,
> >>>file2 and so on. I am using the following INFILE statement to read the
> >>>data:
> >>>
> >>>Infile 'C:\Documents and Settings\AAA\Desktop\file1.txt' DLM = ','
> >>>Lrecl = 32000 DSD Truncover;
> >>>
> >>>FInally I want to save the output as Output1, Output2...etc in Mylibrary
> >>>
> >>>So the last line of the code is:
> >>>
> >>>proc sort data = example out = mylibrary.output1 ; by VarA VarB VarC ;
> >>>run;
> >>>
> >>>Can I process the data using a BAT file where I need a wildcard for the
> >>>file1.txt statement and also for the statement out = mylibrary.output1 .
> >>>
> >>>Or will a Macro be be preferred?
> >>>
> >>>Thanx for the help in advance
> >>> Randy
> >>
> >>All:
> >> Every time I read the txt files in I have to type test1, test2 and so on
> >>in the INFILE statement. Besides, the code for each file takes about 30
> >>minutes to run. That is why I wanted help to determine if I could use a
> >>Macro or use batch processing for these files and save the output as
> >>output1, output2 etc.
> >> Please help.
> >> Randy
> >
> >That's 30 minutes, not 30 seconds?
> >
> >You are in a performance-tuning situation.
> >
> >Perhaps you should explain a bit more. How large is each file? What is the
> >dimension distinguishing the 500 from one another? What is the nature of the
> >data and the tasks to be performed after you have stored all of the data?
>
>Howard:
> Each file has approximately 3.5 million to 4 million observations. To
>start there are about 50 variables across and after manipulation of the
>the data there are about 120 variables. There is no difference between
>these 500 files: All have the same dimensions and variables.
> I cannot use a set command because it is much better to manipulate
>individual files than handle one single huge file. It takes about 30-45
>minutes to run the code on each individual files and I need to find a more
>efficient way to run the data. Perhaps one way is to reduce the number of
>Proc Sorts.
> Randy

I'm going to disagree.

I think Howard is right. (Well, duh. Howard's always right.)

You are going to have a miserable time accessing all 500 tables separately, over and over and over. You are going to have a miserable time storing all that extra crud from those 70 extra variables. (I have to wonder if they are really needed.)

You would be way better off using a DATA step view here. Create a view that has the new variables (if you insist - I would hesitate on that) and stacks the tables into a single tall-and-thin view.

SAS works better with by-processing and tall-and-thin tables.
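
Here is a minimal sketch of what such a view might look like. It is an illustration under assumptions, not tested against your data: the fileref allraw, the view name work.all_files, and the variables fname and source are all made up; the INPUT list uses only the three variable names from your PROC SORT and assumes they are numeric; and it assumes the files have no header row. The wildcard path is your original path with file1.txt generalized to file*.txt.

```sas
/* Hypothetical fileref: the wildcard picks up file1.txt, file2.txt, ... */
filename allraw 'C:\Documents and Settings\AAA\Desktop\file*.txt';

/* A DATA step view stores the program, not the data. */
data work.all_files / view=work.all_files;
    /* fname receives the current physical file name via FILENAME=.   */
    /* It is not written to the output, so copy it into source.       */
    length fname source $ 260;
    infile allraw dlm=',' lrecl=32000 dsd truncover filename=fname;
    /* Assumption: replace with the real list of about 50 variables.  */
    input VarA VarB VarC;
    source = fname;
run;

/* By-processing per input file. NOTSORTED because the rows arrive    */
/* grouped by file but not necessarily in sorted order of source.     */
proc means data=work.all_files n mean;
    by source notsorted;
    var VarA;
run;
```

Keep in mind that a view re-reads the text files every time it is referenced, so if several steps need the same rows it may be worth materializing the subset you actually use.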

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
