| Date: | Wed, 14 Apr 2004 19:48:25 -0600 |
| Reply-To: | Jack Hamilton <JackHamilton@FIRSTHEALTH.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Jack Hamilton <JackHamilton@FIRSTHEALTH.COM> |
| Subject: | Re: Job control with SAS |
You read my question correctly - I want to be able to handle concurrency,
while running as many non-dependent programs as possible.
I'm afraid that I don't usually write well-behaved programs, by your
definition - I rely on SAS to clean up my work data sets. In fact, I
don't think it's possible for a SAS program to completely clean out its
own WORK directory - WORK.REGSTRY and WORK.SASMACR refuse to be deleted
by PROC DATASETS KILL because they're "in use".
The plain %INCLUDE approach doesn't work for me because, although it
will run jobs in sequence, it will stop after the first failure, even
though there are non-dependent jobs remaining which could still be run.
Given

   SEQ  NAME
   1    Prog1
   1    Prog2
   2    Prog3

where Prog3 is dependent on Prog1 and Prog2, if Prog1 fails then Prog2
won't run, even though it could.
I think SYSTASK is probably the most reasonable approach, but I've been
told offline that it sometimes results in mysterious abends; I'll have
to try it for myself.
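For the three-program example above, a SYSTASK version might look
roughly like this. It's an untested sketch: the task names (p1-p3), the
status macro variables (rc1-rc3), and the .sas file names are made up
for illustration, and the command line is simply copied from my original
DATA step.

```sas
/* Sketch only. Task names, status variables, and file names are hypothetical. */
%global rc1 rc2 rc3;   /* STATUS= variables, declared global so they are visible after the macro */

%macro run_flow;
   /* SEQ 1: start both programs without waiting for either one. */
   systask command "$sascmd -nodms -noterminal -errorabend -rsasuser Prog1.sas"
           taskname=p1 status=rc1 nowait;
   systask command "$sascmd -nodms -noterminal -errorabend -rsasuser Prog2.sas"
           taskname=p2 status=rc2 nowait;

   /* Hold here until both SEQ 1 tasks have ended, whether or not they failed. */
   waitfor _all_ p1 p2;

   /* SEQ 2: run the dependent program only if both predecessors succeeded. */
   /* A return code of 0 or 1 is treated as success, as in the DATA step.   */
   %if &rc1 le 1 and &rc2 le 1 %then %do;
      systask command "$sascmd -nodms -noterminal -errorabend -rsasuser Prog3.sas"
              taskname=p3 status=rc3 wait;
   %end;
   %else %put ERROR: Prog3 skipped (rc1=&rc1 rc2=&rc2);
%mend run_flow;

%run_flow
```

Because both first-step tasks are started with NOWAIT, Prog2 still runs
when Prog1 fails; only Prog3 is skipped.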
--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA
>>> Lou <lpogodajr292185@COMCAST.NET> 04/14/2004 6:27 PM >>>
"Don Stanley" <don_stanley@PARADISE.NET.NZ> wrote in message
news:200404140152.i3E1qPu10992@listserv.cc.uga.edu...
> A few, usually minor, pitfalls to beware of with this approach:
>
> (1) create a macro variable in job 1. Have a variable of the same name in
> job 2 but forget to reset it to null in job 2. Testing job 2 independent of
> job 1 works fine, sequential running like below may cause very obscure
> errors in job 2.
A well behaved program, like a well behaved houseguest, cleans up after
itself. No self-respecting program written by a competent programmer would
dream of leaving work datasets, macro variables, titles, footnotes,
libnames, filenames, format or other catalogs, etc. behind. Super proper
programs will also reset options to the status quo ante.
If you're saddled with programmers who refuse to behave with reasonable
politeness, you can write generic clean up code and run it after each
%included program. If you want, you can make it a program all by itself and
%include it too.
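As a rough illustration, generic clean up code of this kind might look
like the sketch below. It is an assumption about what such code would
contain, not something taken from any particular site. MEMTYPE=DATA
limits the deletion to data sets, so catalogs that SAS holds open in
WORK are left alone. Macro variables and options are not handled here.

```sas
/* Hypothetical generic clean-up, meant to run after each %included program. */

/* Delete every data set in WORK; catalogs are not touched. */
proc datasets library=work kill memtype=data nolist;
run;
quit;

title;                 /* clear all titles */
footnote;              /* clear all footnotes */
libname _all_ clear;   /* attempt to deassign user-assigned librefs */
filename _all_ clear;  /* attempt to deassign user-assigned filerefs */
```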
> (2) if errorabend is switched on and job 1 fails, then jobs 2 and 3 will
> not run, but they would not have if running from a scheduler with no
> dependency (as originally stated by Jack some jobs can run concurrent but
> this approach has forced them sequential)
Maybe I misread - I took Jack to mean that his present approach sometimes
results in programs running concurrently and that this was the problem he
was trying to solve - some of the programs did have dependencies and the
program n was off and running before program n-1, on which it was dependent,
was finished. After all, he does ask if there's a way to run the jobs "in
order".
> (3) forces sequential running when jobs may be able to be run together
I guess I'm getting to be an old fuddy-duddy. Someone says "job control"
and I immediately think Job Control Language (JCL). That's what a JCL
rundeck does - kicks off programs sequentially. If you have a bunch of
programs that can run concurrently (and you're running Windows) you can
highlight them all, right click, and click on batch submit - job control
isn't an issue. In this case, it seems that it is - see the previous
comment.
> (4) I have found various abend and Task Violations to occur when running
> large jobs with lots of macro variables sequentially like this, but they do
> not occur when running in parallel or in different SAS sessions. SAS
> thought a memory leak might be the problem, but absolutely no idea when it
> could occur
This hasn't happened to me, nor apparently to anyone else where I work (at
least, no one has ever called it to my attention), so I can't meaningfully
comment.
> This is a perfectly valid use of %include, which I tried at this site and
> discarded due to some of above issues.
>
> Don
>
> On Tue, 13 Apr 2004 21:04:03 -0400, Lou <lpogodajr292185@COMCAST.NET> wrote:
>
> >Much easier, in my opinion, to just invoke each program in turn with an
> >%include command. For instance, you could have a program file called
> >"run all programs.sas" that said:
> >
> >%include "program1.sas";
> >%include "program2.sas";
> >.
> >.
> >.
> >
> >Each program will be run in the order listed, and no program will start
> >before all the preceding ones finish. You can fully qualify the program
> >file names so that programs in various folders or in a folder different
> >from the one where "run all programs.sas" is located are included.
> >
> >No new material below, included for reference only.
> >
> >"Jack Hamilton" <JackHamilton@FIRSTHEALTH.COM> wrote in message
> >news:s07bfe57.034@SLCM02.firsthealth.com...
> >> I have SAS programs which are usually run in sequence. It's easy to put
> >> the names in a file and then use the X command to run the programs one
> >> after another, giving a sort of rudimentary job control:
> >>
> >> =====
> >> data _null_;
> >>
> >>    infile cards;
> >>
> >>    input @1 jobname $40.;
> >>
> >>    now = datetime();
> >>
> >>    if jobname =: '*' or jobname =: '#' then
> >>       do;
> >>          put 'INFO:' now datetime16.0 ' Program skipped: ' jobname;
> >>          return;
> >>       end;
> >>
> >>    put 'INFO: ' now datetime16.0 ' Starting ' jobname;
> >>
> >>    rc = system('$sascmd -nodms -noterminal -errorabend -rsasuser ' ||
> >>                jobname);
> >>
> >>    now = datetime();
> >>    if rc gt 1 then
> >>       do;
> >>          put 'ERROR: ' now datetime16.0 ' Return code ' rc 4.0 ' from '
> >>              jobname;
> >>          abort abend;
> >>       end;
> >>    else
> >>       put ' ' now datetime16.0 ' Return code ' rc 4.0 ' from '
> >>           jobname;
> >>
> >>    put;
> >>
> >> cards;
> >> 01-get-dw-claims-med.sas
> >> 02-get-dw-claims-rx.sas
> >> =====
> >>
> >> But sometimes several of the jobs could be run at the same time, with
> >> subsequent jobs dependent on them; for example, I might have 5 jobs each
> >> of which processes a year's worth of data 2000-2004, and a sixth job to
> >> combine the results. I might express the job flow like this:
> >>
> >> -----
> >> SEQ Program
> >> 1 process-2000.sas
> >> 1 process-2001.sas
> >> 1 process-2002.sas
> >> 1 process-2003.sas
> >> 1 process-2004.sas
> >> 2 combine-years.sas
> >> -----
> >>
> >> I could do that manually using MP CONNECT. Has anyone already written
> >> a program (that they could share) which would read the job list and
> >> produce the SAS code needed to process the jobs in order? I'm lazy and
> >> don't want to reinvent it if it's already been done.
> >>
> >> --
> >> JackHamilton@FirstHealth.com
> >> Manager, Technical Development
> >> Metrics Department, First Health
> >> West Sacramento, California USA