Date: Sun, 30 Apr 2006 20:19:07 +0000
Reply-To: iw1junk@COMCAST.NET
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <iw1junk@COMCAST.NET>
Subject: Re: How to automatically change data name in SET?
Subject: How to automatically change data name in SET?
Summary: A short lesson in the principles of writing macros
Respondent: Ian Whitlock iw1junk@comast.net
Lei,
I see you have gotten some excellent answers, particularly Paul's,
which hits the points I would make. This question and your other
request for a macro book recommendation indicate that it may be
worth a detailed response.
To repeat Paul's concern, it is very unlikely that you want something
like
> ... for m = 3
> DATA SUBRO.RESULT;
> SET SUBRO.MYDATA1;
> SET SUBRO.MYDATA2;
> SET SUBRO.MYDATA3;
> RUN;
Why? The only time this would make sense is when
1) All the data sets had precisely the same length
2) Each data set has variables with different names
3) You want to combine the information for the Nth
observation of each set into one observation on
the resulting file.
Why? 1) The step would stop on the SET statement for the
shortest data set when there were no more records. 2) If
any variable name was common to two or more of these data
sets, the value for the last one would win and all others
got to the great pit in the sky. 3) If the first two
conditions are met, then that is what your step will produce.
Now everyone has given you various methods producing something
closer to what you may want. However, the above request is
so unlikely, that it suggests that you have much to learn
about SAS before it makes sense to look at a book on macro.
On the other hand, I cannot resist the impulse to teach some
principles, so let's consider the problem again.
You have a step
data w ;
set w1 ... wM ;
run ;
You want to make something variable. First step - make it a
macro variable, so we have
data w ;
set &MACVAR ;
run ;
Now the question becomes how to make the macro variable. Simplest
is
%let macvar = w1 w2 w3 ... ;
where the dots are appropriately filled in. This is not automatic,
but it is simple and probably best considering your present state
of SAS knowledge.
The second possibility is
%let macvar = %mac(pref=w,n=4) ;
Now we are down to changing just one thing, the value of N. I do not
see precisely this form so let's add it.
%macro mac ( pref= , n=1 ) ;
%local i list ; %* always declare macro variables local! ;
%do i = 1 %to &n ; %* a loop is required ;
%let list = &list &pref&i ; %* concatenate list of names ;
%end ;
%* return, i.e. generate text ;
&list /* NO semi-colon */
%mend mac ;
This macro has many purposes, not just the one you asked for. Hence it
is worth writing. THAT'S IMPORTANT! Macro's for just one purpose are
probably not worth writing unless the purpose is great enough to demand
the macro in spite of its singular use.
There are two things here. One - isolate the change in a variable.
Two - separate the means of creating the value from the consuming code.
Finally, we don't need the variable in this case because the macro contains
no SAS code other than the desired text. It would be better to write
data w ;
set %mac(pref=w,n=4) ;
run ;
Now the question becomes - what does automate mean? And, what does it
cost? Everything in programming, in fact everything in life, gives you
something and costs you something. The trick is to get at least what it
costs. For example, very few SAS programmers would write
data _null_ ;
ans = 2 + 3 ;
put ans = ;
run ;
to find the sum of 2 and 3. Why? The cost of even typing these 4 lines
is too great for almost all of them.
What about the first part of the question? One interpretation of automate
would be - the computer knows the date and in principle the month number
so we could assign N with an appropriate expression. What is the cost?
Well, you couldn't run the program on May 1, and have it generate only
4 elements in the list. You couldn't repeat a run you made in March when
the boss requests that you use exactly the same code. Now we can compare,
Is it more important to have the simplicity of automate?
Or to be removed from the responsibility for the value of N?
Assuming the convenience of automatic month is greater, we could code
%let n = %sysfunc(month("&sysdate"d)) ;
data w ;
set %mac(pref=w , n=&n) ;
run ;
What if we need to run with the march assumption? The first line
becomes
%let n = 3 ; * %sysfunc(month("&sysdate"d)) ;
So a very minor change is needed, i.e. we got our convenience and have
to pay very little when more flexibility is required.
Compare that with Paul's corresponding code.
> %macro cat (lib=, mem= ) ;
> %local i ;
> %do i = 1 %to %sysfunc (month(%sysfunc (date()))) ;
> &lib..&mem&I
> %end ;
> %mend cat ;
>
> data subro.result ;
> set %cat (lib=work, mem=mydata) ;
> run ;
To run in March, you would have to discard the macro call by changing
the inner line to
set mydata1 mydata2 mydata3 ; * %cat (lib=work, mem=mydata) ;
The cost is higher because you have to locate the line and have more to
type. But even these little things add up as the number of lines in a
program increases. When you add the fact that MAC can be used in
making many more lists, I see a clear advantage to MAC. So why didn't
Paul do it? The answer isn't "Because Paul doesn't think about these
things." It is because Paul was giving a simple answer to your question
and the only cost to Paul was the time spent typing the answer.
How does this fit with Gregg's code
> proc sql noprint;
> select memname into :datasets separated by " "
> from sashelp.vtable
> where libname="SUBRO" and memname like "MYDATA%";
> quit;
>
> %put datasets=&datasets;
>
> data subro.result;
> set &datasets;
> run;
Well it is back to first principles. %LET isn't the only way
to create a macro variable, and Gregg has given you an example
of a very valuable one. He also assumed a different
interpretation of what automate means. His code solves the
problem of running on May 1 when the May data set isn't ready
by using all data set names of a certain form in the given
directory. This could be more appropriate the date interpretation
than Paul made, or it could cause problems when somebody added
an unexpected data set with a name like "MYDATA%". You should
also consider the cost of trying to run this code as if it were March.
Again, like Paul, Gregg could have written any of this code. He didn't
because he was answering the immediate question rather than the
implied request "teach me macro principles".
Still more solutions worth a look are
%macro catdata1 ( pref= , out=) ;
%local i ; %* always declare macro variables local! ;
data &out ;
set
%do i = 1 %to %sysfunc(month("&sysdate"d)) ;
&pref&i /* NO SEMI-COLON */
%end ;
;
run ;
%mend catdata1 ;
%catdata1 ( pref=subro.mydata , out=subro.result )
and
%macro catdata2 ( pref= , out=) ;
%local i list ; %* always declare macro variables local! ;
%do i = 1 %to %sysfunc(month("&sysdate"d)) ;
%let list = &list &pref&i ;
%end ;
data &out ;
set &list ;
run ;
%mend catdata2 ;
%catdata2 ( pref=subro.mydata , out=subro.result )
CATDATA1 illustrates a common answer, but one passed up by Gregg and
Paul because they are too good. So what is wrong with it? 1) It is
harder to read. 2) The solution is to tied even more strongly to your
problem and therefore almost inapplicable to any other use. But it
is important, because many people never graduate from this style of
coding, so one has to get use to macro instructions nested inside
SAS code. Someday, not too soon, you might ask yourself why it is
possible to have the semi-colons inside the SET statement.
CATDATA2 is somewhat better in style, but it still suffers from tying
the code too close to your problem and failing to analyze that a list
making macro will lead to a much more useful tool. On the other hand,
using this style makes it easier to separate the real macro CAT hiding
in this code.
I have not discussed all the answers given or possible. You might
try to analyze what is good about each of them and what is the cost
in using them. Whether the code is a good answer or not lies in part,
where you are coming from, and where you want to get to. But all the
answers indicate techniques that are worth learning and applying in
the appropriate situations.
Incidentally, I created my test data with
data %mac(pref=w,n=4) ;
x = 1 ;
run ;
A good question to ask in evaluating the answers given - Which ones can
be used to create the test data? Most macros will not have this property,
so it is not a general question, but if a macro does have this property
it is a good indication that something worth understanding is going on.
I changed your names to suit me, in each example consider the amount of
work needed to change the names. In general, when writing macro code,
there is a macro author and a macro consumer, who does not have to modify
the macro author's code. For each example, identify the author's code
versus the consumer's code and then consider when changes to the authors
code is required for a consumer's change. (Often the consumer is the
macro author, but it is always wise to distinguish them anyway because
it will make you a better macro author.)
I hope this gives you a little idea of what is involved in writing
good macro code and why a thorough grounding in SAS and programming
are needed. The principles discussed here are probably only learned
via the study of many examples.
Ian Whitlock
===================
Date: Sun, 30 Apr 2006 08:04:09 -0700
Reply-To: lei <leiliang217@YAHOO.COM>
Sender: "SAS(r) Discussion"
From: lei <leiliang217@YAHOO.COM>
Organization: http://groups.google.com
Subject: How to automatically change data name in SET?
Comments: To: sas-l
Content-Type: text/plain; charset="iso-8859-1"
Hi everyone,
I have a question. I have a few datasets, say mydata1, mydata2,
mydata3, till mydata&M where M is dymically changing according to
month. For example in March M=3, in April M=4.
So I wrote a program like this but got error message
DATA SUBRO.RESULT;
DO I = 1 TO &m;
SET SUBRO.MYDATA&I;
END
RUN;
The effect I want to achieve is that, for example, for m = 3
DATA SUBRO.RESULT;
SET SUBRO.MYDATA1;
SET SUBRO.MYDATA2;
SET SUBRO.MYDATA3;
RUN;
for m = 4, it becomes
DATA SUBRO.RESULT;
SET SUBRO.MYDATA1;
SET SUBRO.MYDATA2;
SET SUBRO.MYDATA3;
SET SUBRO.MYDATA4;
RUN;
Is there any way to do this?
Thank you!