LISTSERV at the University of Georgia
Date:         Thu, 13 Jun 2002 13:44:58 -0600
Reply-To:     Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Subject:      Re: dow loop?
Comments: To: Cassell.David@EPAMAIL.EPA.GOV
Content-Type: text/plain; charset=US-ASCII

I think "Whitlock loop" would be a better term. "DOW loop" sounds like it has something to do with industrial averages. And why is the "O" capitalized?

--
JackHamilton@FirstHealth.com
Manager, Technical Development
METRICS Department, First Health
West Sacramento, California USA

>>> "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV> 06/13/2002 11:07 AM >>>
Jay Weedon <jweedon@EARTHLINK.NET> wrote:
> Sorry to display such ignorance. Can someone point me to an
> explanation of what this term means & what it's used for?

Jay, It's not you. It's us. The term has become nearly standard on SAS-L for the Whitlock do-loop. [I believe Paul coined the term, but if I'm wrong I will be promptly corrected - and if I'm right, I may still be promptly corrected! :-] That's the situation where one puts the SET statement inside the do-loop and uses some data set information to end the loop (typically, last.whatever or end-of-file). Ian or Paul [or any other list guru] may wish to refine my loose description.

I was recently confronted with the issue of documenting my use of the DOW-loop in some production code, and I asked Paul off-list what he did in the same situation. Here is the disclaimer he suggested:

------------------------------------------------------------------------
This program may contain one or more constructs similar to the following:

Data <...Data Set Names...> ;
   <...Stuff Executed Before Break-Event...> ;
   Do <...Cnt-Var = From-Var By Step-Var...> Until ( Break-Event ) ;
      Set A ;
      <...Stuff Executed For Each In-Record...> ;
   End ;
   <...Stuff Executed After Break-Event...> ;
Run ;

<The code between angle brackets is, generally speaking, optional.> We call the structure the DOW-loop, where W stands for Ian Whitlock.

The intent of organizing such a structure is to achieve a logical isolation of instructions executed between two successive break-events from actions performed before and after a break-event, in the most programmatically natural manner. In most (but not all) situations, the input data set is grouped and/or sorted, and the break-event occurs when the last record in a by-group has been processed. In such a case, the DOW-loop logically separates actions performed (1) before the first record in a by-group is read, (2) for each record in the group, and (3) after the last record in the group is read.
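As a minimal sketch of a break-event that is not a by-group boundary, the step below loops until end-of-file, so the statement after the loop is reached once, after the whole input has been read. The data set TOTALS, the variable TOTAL, the flag EOF, and the sample records are hypothetical, chosen only for illustration.

```sas
/* Hypothetical input: three records with one numeric variable */
Data A ;
   Input Var ;
   Cards ;
1
2
3
;
Run ;

Data Totals ( Keep = Total ) ;
   /* Break-event: end-of-file, flagged by END= on the SET statement */
   Do Until ( Eof ) ;
      Set A End = Eof ;
      Total = Sum (Total, Var) ;
   End ;
   /* Reached once, after the last record has been read */
   Output ;
Run ;
```

After the OUTPUT, control returns to the top of the step, the SET statement attempts to read past the end of A, and the step terminates without writing a second record.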

Example: Input file A is sorted by ID. This step multiplies and sums all non-missing VAR values within each ID-group, counts both all records and the non-missing records in each group, finds the group average, and writes 1 record with COUNT, SUM, MEAN and PROD to file B after each by-group:

Data B ( Keep = Id Prod Sum Count Mean ) ;
   Prod = 1 ;
   Do Count = 1 By 1 Until ( Last.Id ) ;
      Set A ;
      By Id ;
      If Var <= .Z Then Continue ;
      Mcount = Sum (Mcount, 1) ;
      Prod = Prod * Var ;
      Sum = Sum (Sum, Var) ;
   End ;
   Mean = Sum / Mcount ;
Run ;

How it works (1, 2, 3 denote stuff performed before, between, and after break-event<s>):

(1) PROD and COUNT are set to 1, and the non-retained SUM, MEAN, and MCOUNT are set to missing by default (control is at the top of the Data step).

(2) DOW-loop starts to iterate, reading the next record from A at the top of every iteration. While it iterates, control never leaves the Do-End boundaries. If VAR is missing, CONTINUE passes control straight to the bottom of the loop, otherwise MCOUNT, PROD and SUM are computed. After the last record in the group is processed, the loop stops. At this point, PROD, COUNT, SUM, and MCOUNT contain the group-aggregate values.

(3) Control is transferred to the statement following the loop. MEAN is computed, and control is passed to the bottom of the step, where the implicit OUTPUT writes the record to B. Control is passed to the top of the step, the variables are re-initialized, and the next group is processed.
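To follow the three phases on concrete values, here is a sketch that feeds the example step a small input. The four sample records are hypothetical; the aggregating step itself is the one from the example, unchanged.

```sas
/* Hypothetical sample input, already sorted by ID */
Data A ;
   Input Id Var ;
   Cards ;
1 2
1 .
1 3
2 5
;
Run ;

/* The step from the example */
Data B ( Keep = Id Prod Sum Count Mean ) ;
   Prod = 1 ;                               /* (1) before the group     */
   Do Count = 1 By 1 Until ( Last.Id ) ;
      Set A ;
      By Id ;
      If Var <= .Z Then Continue ;          /* skip missing VAR         */
      Mcount = Sum (Mcount, 1) ;            /* (2) for each record      */
      Prod   = Prod * Var ;
      Sum    = Sum (Sum, Var) ;
   End ;
   Mean = Sum / Mcount ;                    /* (3) after the group      */
Run ;

Proc Print Data = B ;
Run ;
```

Tracing by hand, ID 1 should come out with COUNT=3, SUM=5, PROD=6 and MEAN=2.5 (the record with missing VAR is counted in COUNT but not in MCOUNT), and ID 2 with COUNT=1, SUM=5, PROD=5 and MEAN=5.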

Note: Contrary to the common practice, the accumulation variables need NOT be retained. Because the DOW-loop passes control to the top of the Data step ONLY before the first record in a by-group is to be read, this is the only point where non-retained variables are reset to missing, and it is exactly where this action is required.

-------------------------------------------
-- Paul Dorfman 2001/08/11 --
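For contrast with the note on retained variables, here is a sketch of how the same aggregation might be written without the DOW-loop, reading one record per iteration of the Data step. This is an illustrative formulation, not code from the original disclaimer; the output name B2 and the sample records are hypothetical.

```sas
/* Hypothetical sample input, already sorted by ID */
Data A ;
   Input Id Var ;
   Cards ;
1 2
1 .
1 3
2 5
;
Run ;

/* One record per step iteration: the accumulators would be reset  */
/* to missing on every record unless they are retained, and they   */
/* must be re-initialized by hand at the start of each by-group.   */
Data B2 ( Keep = Id Prod Sum Count Mean ) ;
   Set A ;
   By Id ;
   Retain Prod Sum Count Mcount ;
   If First.Id Then Do ;
      Prod   = 1 ;
      Sum    = . ;
      Count  = 0 ;
      Mcount = . ;
   End ;
   Count = Count + 1 ;
   If Var > .Z Then Do ;
      Mcount = Sum (Mcount, 1) ;
      Prod   = Prod * Var ;
      Sum    = Sum (Sum, Var) ;
   End ;
   If Last.Id Then Do ;
      Mean = Sum / Mcount ;
      Output ;
   End ;
Run ;
```

The RETAIN statement, the FIRST.ID reset block, and the explicit OUTPUT under LAST.ID are the bookkeeping that the DOW-loop form makes unnecessary.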

I think that describes the situation better than I would have.

David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

