LISTSERV at the University of Georgia
Date:   Wed, 14 Apr 2004 19:48:25 -0600
Reply-To:   Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Subject:   Re: Job control with SAS
Comments:   To: lpogodajr292185@COMCAST.NET
Content-Type:   text/plain; charset=us-ascii

You read my question correctly - I want to be able to handle dependencies while running as many non-dependent programs concurrently as possible.

I'm afraid that I don't usually write well-behaved programs, by your definition - I rely on SAS to clean up my work data sets. In fact, I don't think it's possible for a SAS program to completely clean out its own WORK directory - WORK.REGSTRY and WORK.SASMACR refuse to be deleted by PROC DATASETS KILL because they're "in use".

The plain %INCLUDE approach doesn't work for me because, although it will run jobs in sequence, it will stop after the first failure, even though there are non-dependent jobs remaining which could still be run. Given

SEQ NAME
1   Prog1
1   Prog2
2   Prog3

where Prog3 is dependent on Prog1 and Prog2, if Prog1 fails then Prog2 won't run, even though it could.

I think SYSTASK is probably the most reasonable approach, but I've been told offline that it sometimes results in mysterious abends; I'll have to try it for myself.
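
For the three-program example above, a SYSTASK version might look roughly like the sketch below. It is untested and makes several assumptions: the task names and status macro variable names are invented for illustration, $sascmd is taken to be the same environment variable used in the earlier X-command job, the SHELL option is assumed to be needed so that the shell expands it, and a return code above 1 is treated as failure, as in the original DATA step.

```sas
/* Sketch only: task names and rc_* macro variables are hypothetical. */

/* SEQ 1: start both independent programs without waiting for either. */
systask command "$sascmd -nodms -noterminal -rsasuser Prog1.sas"
        nowait taskname=prog1 status=rc_prog1 shell;
systask command "$sascmd -nodms -noterminal -rsasuser Prog2.sas"
        nowait taskname=prog2 status=rc_prog2 shell;

/* Block until every SEQ 1 task has finished, whether it succeeded or not. */
waitfor _all_ prog1 prog2;

/* SEQ 2: run Prog3 only if both prerequisites ended with rc 0 or 1. */
%macro run_seq2;
  %if &rc_prog1 le 1 and &rc_prog2 le 1 %then %do;
    systask command "$sascmd -nodms -noterminal -rsasuser Prog3.sas"
            wait taskname=prog3 status=rc_prog3 shell;
    %put INFO: Prog3 finished with return code &rc_prog3;
  %end;
  %else %put ERROR: Prog3 skipped (Prog1 rc=&rc_prog1, Prog2 rc=&rc_prog2);
%mend run_seq2;
%run_seq2
```

Because Prog1 and Prog2 are separate tasks, a failure in Prog1 does not prevent Prog2 from running; only the dependent Prog3 is held back.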

--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA

>>> Lou <lpogodajr292185@COMCAST.NET> 04/14/2004 6:27 PM >>>
"Don Stanley" <don_stanley@PARADISE.NET.NZ> wrote in message
news:200404140152.i3E1qPu10992@listserv.cc.uga.edu...
> Few, usually, minor pitfalls to beware of with this approach
>
> (1) create a macro variable in job 1. Have a variable of the same name in
> job 2 but forget to reset it to null in job 2. Testing job 2 independent of
> job 1 works fine, sequential running like below may cause very obscure
> errors in job 2.

A well behaved program, like a well behaved houseguest, cleans up after itself. No self-respecting program written by a competent programmer would dream of leaving work datasets, macro variables, titles, footnotes, libnames, filenames, format or other catalogs, etc. behind. Super proper programs will also reset options to the status quo ante.

If you're saddled with programmers who refuse to behave with reasonable politeness, you can write generic clean up code and run it after each %included program. If you want, you can make it a program all by itself and %include it too.

> (2) if errorabend is switched on and job 1 fails, then jobs 2 and 3 will
> not run, but they would not have if running from a scheduler with no
> dependency (as originally stated by Jack some jobs can run concurrent but
> this approach has forced them sequential)

Maybe I misread - I took Jack to mean that his present approach sometimes results in programs running concurrently and that this was the problem he was trying to solve - some of the programs did have dependencies and the program n was off and running before program n-1, on which it was dependent, was finished. After all, he does ask if there's a way to run the jobs "in order".

> (3) forces sequential running when jobs may be able to be run together

I guess I'm getting to be an old fuddy-duddy. Someone says "job control" and I immediately think Job Control Language (JCL). That's what a JCL rundeck does - kicks off programs sequentially. If you have a bunch of programs that can run concurrently (and you're running windows) you can highlight them all, right click, and click on batch submit - job control isn't an issue. In this case, it seems that it is - see the previous comment.

> (4) I have found various abend and Task Violations to occur when running
> large jobs with lots of macro variables sequentially like this, but they do
> not occur when running in parallel or in different SAS sessions. SAS
> thought a memory leak might be the problem, but absolutely no idea when it
> could occur

This hasn't happened to me, nor apparently to anyone else where I work (at least, no one has ever called it to my attention), so I can't meaningfully comment.

> This is a perfectly valid use of %include, which I tried at this site and
> discarded due to some of above issues.
>
> Don
>
> On Tue, 13 Apr 2004 21:04:03 -0400, Lou <lpogodajr292185@COMCAST.NET> wrote:
>
> >Much easier, in my opinion, to just invoke each program in turn with an
> >%include command. For instance, you could have a program file called "run
> >all programs.sas" that said:
> >
> >%include program1.sas
> >%include program2.sas
> >.
> >.
> >.
> >
> >Each program will be run in the order listed, and no program will start
> >before all the preceding ones finish. You can fully qualify the program
> >file names so that programs in various folders or in a folder different from
> >the one where "run all programs.sas" is located are included.
> >
> >No new material below, included for reference only.
> >
> >"Jack Hamilton" <JackHamilton@FIRSTHEALTH.COM> wrote in message
> >news:s07bfe57.034@SLCM02.firsthealth.com...
> >> I have SAS programs which are usually run in sequence. It's easy to put
> >> the names in a file and then use the X command to run the programs one
> >> after another, giving a sort of rudimentary job control:
> >>
> >> =====
> >> data _null_;
> >>
> >>    infile cards;
> >>
> >>    input @1 jobname $40.;
> >>
> >>    now = datetime();
> >>
> >>    if jobname =: '*' or jobname =: '#' then
> >>       do;
> >>       put 'INFO:' now datetime16.0 ' Program skipped: ' jobname;
> >>       return;
> >>       end;
> >>
> >>    put 'INFO: ' now datetime16.0 ' Starting ' jobname;
> >>
> >>    rc = system('$sascmd -nodms -noterminal -errorabend -rsasuser ' ||
> >>                jobname);
> >>
> >>    now = datetime();
> >>    if rc gt 1 then
> >>       do;
> >>       put 'ERROR: ' now datetime16.0 ' Return code ' rc 4.0 ' from '
> >>           jobname;
> >>       abort abend;
> >>       end;
> >>    else
> >>       put ' ' now datetime16.0 ' Return code ' rc 4.0 ' from '
> >>           jobname;
> >>
> >>    put;
> >>
> >> cards;
> >> 01-get-dw-claims-med.sas
> >> 02-get-dw-claims-rx.sas
> >> =====
> >>
> >> But sometimes several of the jobs could be run at the same time, with
> >> subsequent jobs dependent on them; for example, I might have 5 jobs each
> >> of which processes a year's worth of data 2000-2005, and a fifth job to
> >> combine the results. I might express the job flow like this:
> >>
> >> -----
> >> SEQ Program
> >> 1   process-2000.sas
> >> 1   process-2001.sas
> >> 1   process-2002.sas
> >> 1   process-2003.sas
> >> 1   process-2004.sas
> >> 2   combine-years.sas
> >> -----
> >>
> >> I could do that manually using MP CONNECT. Has anyone already written
> >> a program (that they could share) which would read the job list and
> >> produce the SAS code needed to process the jobs in order? I'm lazy and
> >> don't want to reinvent it if it's already been done.
> >>
> >> --
> >> JackHamilton@FirstHealth.com
> >> Manager, Technical Development
> >> Metrics Department, First Health
> >> West Sacramento, California USA

