Date: Wed, 3 Jul 2002 10:54:52 -0400
Reply-To: Ian Whitlock <WHITLOI1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject: Re: Using a macro variables.
Subject: Using a macro variables.
Summary: Confusion between the role of arrays and that of macro variables
Respondent: IanWhitlock@westat.com
Jess Balint [JBalint@ALLDATA.NET] asked a common question about why some
of his DATA step code isn't working.
> if first.consumer_id then do;
> %let x = 1;
> end; else do;
> %let x = x + 1;
> end;
> bank_card_&x_src = bcrd_src;
> bank_card_&x_date = bcrd_date;
Ed Heaton [edwardheaton@westat.com] and Otto Schramek
[otto.schramek.os@BAYER-AG.DE] have given two solutions without really
looking at the root cause of the problem. I am more interested in why it
is a problem than in the solution.
The %LET statements are executed before the DATA step is fully
compiled, as Ed explained. They could even be put in front of the DATA
step without any change in code behavior. Since the second %LET comes
last, the final value of X will be the 5 characters, x + 1, which of
course cannot be part of a SAS name as the code is written. What
about the DO-blocks? They are empty, since the macro instructions have
already been executed. (Note that they were not conditionally executed;
both %LETs were executed in order.)
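A minimal sketch of the timing, for anyone who wants to see it in the
log. The DATA step here is made up for illustration (it reads nothing
and writes nothing); the only point is that both %LETs run while the
step is being compiled, whichever branch the IF would take at
execution time. Note also that, as written, &x_src asks for a macro
variable named X_SRC; a period is needed to end the name, as in
&x._src.

```sas
data _null_;
   if 0 then do;
      %let x = 1;          /* run by the macro processor, not by the IF */
   end;
   else do;
      %let x = x + 1;      /* also run; plain text, no arithmetic */
   end;
run;

%put x resolves to [&x];   /* log shows: x resolves to [x + 1] */

%let x = 2;
%put bank_card_&x._src;    /* log shows: bank_card_2_src */
```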
Another way to look at this is to think, I have statements
   bank_card_1_src = bcrd_src;
   bank_card_2_src = bcrd_src;
   bank_card_3_src = bcrd_src;
   bank_card_4_src = bcrd_src;
   ...
My problem is to decide which one to execute. Can that problem be solved
at compile time? No, because all of the statements must be executed at
some time. Macro cannot provide a solution. When you write
   bank_card_&x_src = bcrd_src;
the question is which value of X the compiler will see. It can see
only one value, so no matter what form that value has, it cannot be
correct. The problem must be solved either by allowing a data value to
determine a variable name or by writing a lot of decision-making code.
Arrays provide the ability to refer to many different variables via a
common name and a data value. Hence arrays will provide a good answer
using DATA step code.
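As a rough sketch of the array approach with the names as Jess wanted
them: the CARDS data set, its values, the variable types, the array
names SRCS and DATES, and the limit of three cards per consumer are
all assumptions made up for illustration. Because the index sits in
the middle of each name, the array elements have to be listed one by
one.

```sas
/* Hypothetical input, sorted by consumer_id */
data cards;
   input consumer_id bcrd_src $ bcrd_date :date9.;
   format bcrd_date date9.;
   datalines;
1 VISA 01JAN2002
1 AMEX 15FEB2002
1 DISC 20MAR2002
2 VISA 03MAR2002
;

data wide;
   set cards;
   by consumer_id;

   /* One array per set of target variables; names listed explicitly */
   array srcs  [*] $ 8 bank_card_1_src  bank_card_2_src  bank_card_3_src;
   array dates [*]     bank_card_1_date bank_card_2_date bank_card_3_date;
   retain bank_card_: ;            /* keep values across rows of a consumer */

   if first.consumer_id then do;
      x = 0;
      call missing(of srcs[*], of dates[*]);
   end;

   x + 1;                          /* sum statement; x is retained */
   if x <= dim(srcs) then do;
      srcs[x]  = bcrd_src;         /* the data value x picks the variable */
      dates[x] = bcrd_date;
   end;

   if last.consumer_id then output;   /* one row per consumer */
   format bank_card_1_date bank_card_2_date bank_card_3_date date9.;
   drop x bcrd_src bcrd_date;
run;
```

Here X is an ordinary DATA step variable whose value changes from row
to row, which is exactly what a macro variable cannot do inside a
single compiled step.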
Now what is the real problem? The variable names. Someone chose
names like BANK_CARD_1_SRC, BANK_CARD_2_SRC. This is a classic case
of how to make life difficult for the programmer. It is perhaps the
root cause of why Jess turned to using macro, since only macro could
mess with the variable names. The mistake lay in not realizing that
although macro could choose amongst different names, it still could
choose only one, and that decision would be made before the DATA step
executed. (Another root cause may be the learning of macro prior to a
sound understanding of how to code in SAS.)
Had the index been stuck on the end of the name, one could write
   array bank_card_src (*) bank_card_src1 - bank_card_src&n ;
and
   bank_card_src [x] = bcrd_src ;
The only problem here is to determine an appropriate size for N in a
preparatory step. Of course, for some customers the appropriate size
may be 1 or 2, and for others hundreds; hence there may not be an
appropriate size. That is a good indication that perhaps the problem
itself is wrong. In many cases the original data would be far more
flexible and easier to work with than the transformed data. So why
should the transformation be done in the first place?
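One possible shape for the preparatory step, assuming the index is on
the end of the name. The CARDS data set, its values, and the variable
types are hypothetical; PROC SQL is just one convenient way to find
the largest group, and the second array for the dates is added by
analogy with the first.

```sas
/* Hypothetical input, sorted by consumer_id */
data cards;
   input consumer_id bcrd_src $ bcrd_date :date9.;
   format bcrd_date date9.;
   datalines;
1 VISA 01JAN2002
1 AMEX 15FEB2002
1 DISC 20MAR2002
2 VISA 03MAR2002
;

/* Preparatory step: most cards held by any one consumer */
proc sql noprint;
   select max(cnt) into :n
   from (select count(*) as cnt
         from cards
         group by consumer_id);
quit;
%let n = &n;    /* strips the blanks that INTO leaves around the number */

data wide;
   set cards;
   by consumer_id;

   /* Numbered range lists work because the index is a suffix */
   array bank_card_src  [*] $ 8 bank_card_src1  - bank_card_src&n;
   array bank_card_date [*]     bank_card_date1 - bank_card_date&n;
   retain bank_card_src: bank_card_date: ;

   if first.consumer_id then do;
      x = 0;
      call missing(of bank_card_src[*], of bank_card_date[*]);
   end;

   x + 1;
   bank_card_src [x]  = bcrd_src;
   bank_card_date [x] = bcrd_date;

   if last.consumer_id then output;
   format bank_card_date: date9.;
   drop x bcrd_src bcrd_date;
run;
```

Here the macro variable N is used properly: it holds one value, fixed
before the DATA step compiles, while X does the row-by-row choosing.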
The advantage of Otto's solution is that it avoids the need for
calculating N, since transpose will use the maximum required size, as
one must. But it does not avoid the fact that the data has been made
harder to work with. It also did not solve the problem of getting the
names that Jess wanted, although a little renaming macro could do
that. The fact that such a simple problem requires macro at all is a
symptom that there is something wrong with the problem or the language.
In this case I see it as an indictment of the problem rather than
the language. In any case I hope some reader will have a clearer
understanding of how to make life difficult for the programmer with
this classic case. Of course, it is just a special case of the
general principle that when you ask a programmer to do the wrong thing
in the wrong form there will be some nasty consequences. If the only
consequence is to keep a programmer employed by working longer and
harder on a problem than need be, perhaps one should not try to expose
the principle, but rather to employ it more often.
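For readers who want to see the transpose route and the little
renaming macro together, here is a sketch. Otto's actual code is not
reproduced in this message, so the PROC TRANSPOSE steps are a guess at
its general shape; the CARDS data set, the macro name RENAME_CARDS,
and the hard-coded count of 2 are made up for illustration.

```sas
/* Hypothetical input, sorted by consumer_id */
data cards;
   input consumer_id bcrd_src $ bcrd_date :date9.;
   format bcrd_date date9.;
   datalines;
1 VISA 01JAN2002
1 AMEX 15FEB2002
2 VISA 03MAR2002
;

/* TRANSPOSE creates as many columns as the largest consumer needs */
proc transpose data=cards out=src_wide (drop=_name_)
               prefix=bank_card_src;
   by consumer_id;
   var bcrd_src;
run;

proc transpose data=cards out=date_wide (drop=_name_)
               prefix=bank_card_date;
   by consumer_id;
   var bcrd_date;
run;

data wide;
   merge src_wide date_wide;
   by consumer_id;
run;

/* Generates old = new pairs that move the index into the middle */
%macro rename_cards(n);
   %local i;
   %do i = 1 %to &n;
      bank_card_src&i  = bank_card_&i._src
      bank_card_date&i = bank_card_&i._date
   %end;
%mend rename_cards;

proc datasets library=work nolist;
   modify wide;
   rename %rename_cards(2);   /* 2 = most cards in the sample data */
quit;
```

The macro is legitimate here because it only writes text: the set of
names is settled before PROC DATASETS runs, and no row-by-row decision
is being asked of it.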
Ian Whitlock