Date: Thu, 13 Jun 2002 11:07:48 -0700
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject: Re: dow loop?
Content-type: text/plain; charset=us-ascii
Jay Weedon <jweedon@EARTHLINK.NET> wrote:
> Sorry to display such ignorance. Can someone point me to an
> explanation of what this term means & what it's used for?
It's not you. It's us. The term has become nearly standard on
SAS-L for the Whitlock do-loop. [I believe Paul coined the term, but if
wrong I will be promptly corrected - and if I'm right, I may still be
promptly corrected! :-] That's the situation where one puts the SET
statement inside the do-loop and uses some data set information to end
the loop (typically, last.whatever or end-of-file). Ian or Paul [or
any other list guru] may wish to refine my loose description.
I was recently confronted with the issue of documenting my use of the
DOW-loop in some production code, and I asked Paul off-list what he did
in the same situation. Here is the disclaimer he suggested:
This program may contain one or more constructs similar to the
following:

   Data <...Data Set Names...> ;
      <...Stuff Executed Before Break-Event... > ;
      Do <...Cnt-Var = From-Var By Step-Var...> Until ( Break-Event ) ;
         Set A ;
         <...Stuff Executed For Each In-Record...> ;
      End ;
      <...Stuff Executed After Break-Event... > ;
   Run ;

<The code between angle brackets is, generally speaking, optional.> We
call the structure the DOW-loop, where W stands for Ian Whitlock.
The intent of organizing such a structure is to achieve a logical
separation of instructions executed between two successive break-events
from actions performed before and after a break-event, in the most
programmatically natural manner. In most (but not all) situations, the
input data set is grouped and/or sorted, and the break-event occurs when
the last record in the by-group has been processed. In such a case, the
DOW-loop logically separates actions performed (1) before the first
record in a by-group is read, (2) for each record in the group, and
(3) after the last record in the group is read.
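
The end-of-file break-event, the main case that does not involve
by-groups, can be sketched in the same shape. This is only an
illustrative sketch: the output data set TOTAL, the counter N and the
flag EOF are assumed names, and A is assumed to hold a numeric VAR.

```sas
/* Sketch: one DOW-loop over the whole file; the break-event is end-of-file. */
Data Total ( Keep = N Sum ) ;
   Do N = 1 By 1 Until ( Eof ) ;
      Set A End = Eof ;          /* EOF becomes 1 on the last record of A   */
      Sum = Sum (Sum, Var) ;     /* SUM function ignores missing arguments  */
   End ;
   /* Implicit OUTPUT writes a single record with the record count and total. */
Run ;
```

After the single record is written, control returns to the top of the
step, the next SET finds no more records, and the step ends.
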
Example: Input file A is sorted by ID. This step multiplies and sums
all VAR values within each ID-group, counts the number of all and
non-missing records in each group, finds the group average, and writes 1
record with COUNT, SUM, MEAN and PROD to file B after each by-group:
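
   Data B ( Keep = Id Prod Sum Count Mean) ;
      Prod = 1 ;                             /* (1) before the by-group   */
      Do Count = 1 By 1 Until ( Last.Id ) ;  /* COUNT counts all records  */
         Set A ;
         By Id ;
         If Var <= .Z Then Continue ;        /* skip missing VAR          */
         Mcount = Sum (Mcount, 1) ;          /* count non-missing records */
         Prod = Prod * Var ;
         Sum = Sum (Sum, Var) ;
      End ;
      Mean = Sum / Mcount ;                  /* (3) after the by-group    */
   Run ;
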
How it works (1, 2, 3 denote stuff performed before, between, and after
break-event<s>): (1) PROD and COUNT are set to 1, and the non-retained
SUM, MEAN, and MCOUNT are set to missing by default (control is at the
top of the Data step). (2) The DOW-loop starts to iterate, reading the
next record at the top of every iteration. While it iterates, control
never leaves the Do-End boundaries. If VAR is missing, CONTINUE passes
control straight to the bottom of the loop; otherwise MCOUNT, PROD and
SUM are computed. After the last record in the group is processed, the
loop stops. At this point, PROD, COUNT, SUM, and MCOUNT contain the
group-aggregate values. (3) Control is transferred to the statement
following the loop. MEAN is computed, and control is passed to the
bottom of the step, where the implicit OUTPUT writes the record to B.
Control is then passed to the top of the step, the non-retained
variables are re-initialized, and the next group is processed.
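
To watch the three phases on concrete data, a small made-up input can be
run through the example step. The values below are purely illustrative
assumptions; the only requirements are that A is sorted by ID and has a
numeric VAR.

```sas
/* Hypothetical test input: two ID-groups, one missing VAR in group 1. */
Data A ;
   Input Id Var ;
Datalines ;
1 2
1 .
1 3
2 5
;
Run ;

/* After running the example step that creates B, list the result. */
Proc Print Data = B ;
Run ;
```

With this input, B should hold two records: for ID 1, COUNT = 3,
PROD = 6, SUM = 5 and MEAN = 2.5 (the missing VAR is counted in COUNT
but skipped everywhere else); for ID 2, COUNT = 1, PROD = 5, SUM = 5
and MEAN = 5.
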
Note: Contrary to the common practice, the accumulation variables need
not be retained. Because the DOW-loop passes control to the top of the
Data step ONLY before the first record in a by-group is to be read, this
is the only point where non-retained variables are reset to missing, and
it is exactly where this action is required.
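
For contrast, the common practice mentioned in the note might look
roughly like the sketch below, which relies on the implicit Data step
loop and therefore needs RETAIN plus a manual reset at FIRST.ID. This is
an assumed, typical formulation rather than code from the thread, and
the output name B_RETAIN is hypothetical.

```sas
/* Sketch: same aggregation with the implicit loop, RETAIN and First./Last. */
Data B_Retain ( Keep = Id Prod Sum Count Mean ) ;
   Set A ;
   By Id ;
   Retain Prod Sum Count Mcount ;   /* keep values across step iterations */
   If First.Id Then Do ;            /* manual reset at the start of a group */
      Prod   = 1 ;
      Sum    = . ;
      Count  = 0 ;
      Mcount = . ;
   End ;
   Count = Count + 1 ;              /* counts all records in the group    */
   If Var > .Z Then Do ;            /* non-missing VAR only               */
      Mcount = Sum (Mcount, 1) ;
      Prod   = Prod * Var ;
      Sum    = Sum (Sum, Var) ;
   End ;
   If Last.Id ;                     /* continue only on the last record   */
   Mean = Sum / Mcount ;
Run ;
```

The DOW-loop version gets the same effect without the RETAIN statement
and the FIRST.ID reset block, because its variables are reset to missing
only when control returns to the top of the step between by-groups.
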
-- Paul Dorfman 2001/08/11 --
I think that describes the situation better than I would have.
David Cassell, CSC
Senior computing specialist