Date: Thu, 10 May 2001 10:12:31 -0400
Reply-To: Ian Whitlock <WHITLOI1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject: DATA Step Debugger
Content-Type: text/plain; charset="iso-8859-1"
Summary: Meaningless assignment appears to change the flow of control
Respondent: Ian Whitlock <whitloi1@westat.com>

A colleague posted code similar to that below as a solution to a
lookup problem in which both the FIND file and the LOOKUP file
contain duplicate keys. My question is not about his problem or
solution, but about the DATA step debugger. I added the DEBUG option
to his last step in order to see exactly how it executes.

The first time the inner loop is entered, the debugger shows it
executing only once before control returns to the outer loop. I found
this very surprising! When the asterisk is removed from the comment
statement "* x = 1 ;", turning it into an assignment, the flow returns
to what I would expect: the inner loop executing more than once.

Since the program executes correctly in either case, I suspect the
debugger is showing the result of some optimization that is not
possible with the modified line. However, I find it very puzzling
and would like other opinions.
Here is the code:
/* Large dataset - find all values for keys in find */
data find;
   input key;
   CARDS;
5
6
6
;
/* Lookup dataset contains duplicate keys */
data lookup(index=(key));
   input key value ;
   CARDS;
3 1
5 2
5 3
6 4
;
/* Why is the flow of control changed
when the * is removed from the first line
of the outer loop?
*/
data _null_ / debug ;
   set find ;
   currkey = key ;
   reset = 1;
   key = .;
   do until (_iorc_ ~= 0);
      * x = 1 ;
      do until (reset = 0);
         if key ~= . then reset = 0;
         set lookup key=key ;
         key = currkey ;
      end;
      if _iorc_ = 0 then put _n_= key= value= ;
      else _error_ = 0;
   end;
run;
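
One way to cross-check what the debugger displays is to run the same
logic without the DEBUG option and let PUT statements record each pass
through the loops in the log. The step below is only a sketch of that
idea, not part of the colleague's code. The counters OUTER and INNER are
hypothetical names added for illustration, and the sketch assumes that
the extra statements do not themselves alter the flow.

```sas
/* Trace sketch: same logic as the step above, minus the DEBUG option.  */
/* OUTER and INNER are illustrative counters, not part of the original. */
data _null_ ;
   set find ;
   currkey = key ;
   reset = 1 ;
   key = . ;
   outer = 0 ;   /* restart the counters for each FIND observation */
   inner = 0 ;
   do until (_iorc_ ~= 0) ;
      outer + 1 ;
      put 'outer pass: ' _n_= outer= ;
      do until (reset = 0) ;
         inner + 1 ;
         if key ~= . then reset = 0 ;
         set lookup key=key ;
         /* log the state right after each indexed read */
         put '   inner pass: ' inner= reset= _iorc_= ;
         key = currkey ;
      end ;
      if _iorc_ = 0 then put _n_= key= value= ;
      else _error_ = 0 ;
   end ;
run ;
```

If the log shows more than one inner pass for the first outer pass, the
single pass seen in the debugger would be a matter of how the debugger
displays the flow rather than of how the step actually executes.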

Please run the step with the DEBUG option and tell me what you think.
Is it a bug or an optimization? If the latter, what sort of
optimization is it?

Ian Whitlock