LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2006, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sun, 30 Apr 2006 20:19:07 +0000
Reply-To:     iw1junk@COMCAST.NET
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ian Whitlock <iw1junk@COMCAST.NET>
Subject:      Re: How to automatically change data name in SET?
Comments: cc: lei <leiliang217@YAHOO.COM>, toby dunn <tobydunn@HOTMAIL.COM>,
          "Gregg P. Snell" <sas-l@DATASAVANTCONSULTING.COM>,
          "Paul M. Dorfman" <sashole@BELLSOUTH.NET>

Subject: How to automatically change data name in SET? Summary: A short lesson in the principles of writing macros Respondent: Ian Whitlock iw1junk@comast.net

Lei,

I see you have gotten some excellent answers, particularly Paul's, which hits the points I would make. This question and your other request for a macro book recommendation indicate that it may be worth a detailed response.

To repeat Paul's concern, it is very unlikely that you want something like

> ... for m = 3 > DATA SUBRO.RESULT; > SET SUBRO.MYDATA1; > SET SUBRO.MYDATA2; > SET SUBRO.MYDATA3; > RUN;

Why? The only time this would make sense is when

1) All the data sets had precisely the same length 2) Each data set has variables with different names 3) You want to combine the information for the Nth observation of each set into one observation on the resulting file.

Why? 1) The step would stop on the SET statement for the shortest data set when there were no more records. 2) If any variable name was common to two or more of these data sets, the value for the last one would win and all others got to the great pit in the sky. 3) If the first two conditions are met, then that is what your step will produce.

Now everyone has given you various methods producing something closer to what you may want. However, the above request is so unlikely, that it suggests that you have much to learn about SAS before it makes sense to look at a book on macro.

On the other hand, I cannot resist the impulse to teach some principles, so let's consider the problem again.

You have a step

data w ; set w1 ... wM ; run ;

You want to make something variable. First step - make it a macro variable, so we have

data w ; set &MACVAR ; run ;

Now the question becomes how to make the macro variable. Simplest is

%let macvar = w1 w2 w3 ... ;

where the dots are appropriately filled in. This is not automatic, but it is simple and probably best considering your present state of SAS knowledge.

The second possibility is

%let macvar = %mac(pref=w,n=4) ;

Now we are down to changing just one thing, the value of N. I do not see precisely this form so let's add it.

%macro mac ( pref= , n=1 ) ; %local i list ; %* always declare macro variables local! ; %do i = 1 %to &n ; %* a loop is required ; %let list = &list &pref&i ; %* concatenate list of names ; %end ; %* return, i.e. generate text ; &list /* NO semi-colon */ %mend mac ;

This macro has many purposes, not just the one you asked for. Hence it is worth writing. THAT'S IMPORTANT! Macro's for just one purpose are probably not worth writing unless the purpose is great enough to demand the macro in spite of its singular use.

There are two things here. One - isolate the change in a variable. Two - separate the means of creating the value from the consuming code. Finally, we don't need the variable in this case because the macro contains no SAS code other than the desired text. It would be better to write

data w ; set %mac(pref=w,n=4) ; run ;

Now the question becomes - what does automate mean? And, what does it cost? Everything in programming, in fact everything in life, gives you something and costs you something. The trick is to get at least what it costs. For example, very few SAS programmers would write

data _null_ ; ans = 2 + 3 ; put ans = ; run ;

to find the sum of 2 and 3. Why? The cost of even typing these 4 lines is too great for almost all of them.

What about the first part of the question? One interpretation of automate would be - the computer knows the date and in principle the month number so we could assign N with an appropriate expression. What is the cost? Well, you couldn't run the program on May 1, and have it generate only 4 elements in the list. You couldn't repeat a run you made in March when the boss requests that you use exactly the same code. Now we can compare,

Is it more important to have the simplicity of automate? Or to be removed from the responsibility for the value of N?

Assuming the convenience of automatic month is greater, we could code

%let n = %sysfunc(month("&sysdate"d)) ; data w ; set %mac(pref=w , n=&n) ; run ;

What if we need to run with the march assumption? The first line becomes

%let n = 3 ; * %sysfunc(month("&sysdate"d)) ;

So a very minor change is needed, i.e. we got our convenience and have to pay very little when more flexibility is required.

Compare that with Paul's corresponding code.

> %macro cat (lib=, mem= ) ; > %local i ; > %do i = 1 %to %sysfunc (month(%sysfunc (date()))) ; > &lib..&mem&I > %end ; > %mend cat ; > > data subro.result ; > set %cat (lib=work, mem=mydata) ; > run ;

To run in March, you would have to discard the macro call by changing the inner line to

set mydata1 mydata2 mydata3 ; * %cat (lib=work, mem=mydata) ;

The cost is higher because you have to locate the line and have more to type. But even these little things add up as the number of lines in a program increases. When you add the fact that MAC can be used in making many more lists, I see a clear advantage to MAC. So why didn't Paul do it? The answer isn't "Because Paul doesn't think about these things." It is because Paul was giving a simple answer to your question and the only cost to Paul was the time spent typing the answer.

How does this fit with Gregg's code

> proc sql noprint; > select memname into :datasets separated by " " > from sashelp.vtable > where libname="SUBRO" and memname like "MYDATA%"; > quit; > > %put datasets=&datasets; > > data subro.result; > set &datasets; > run;

Well it is back to first principles. %LET isn't the only way to create a macro variable, and Gregg has given you an example of a very valuable one. He also assumed a different interpretation of what automate means. His code solves the problem of running on May 1 when the May data set isn't ready by using all data set names of a certain form in the given directory. This could be more appropriate the date interpretation than Paul made, or it could cause problems when somebody added an unexpected data set with a name like "MYDATA%". You should also consider the cost of trying to run this code as if it were March. Again, like Paul, Gregg could have written any of this code. He didn't because he was answering the immediate question rather than the implied request "teach me macro principles".

Still more solutions worth a look are

%macro catdata1 ( pref= , out=) ; %local i ; %* always declare macro variables local! ; data &out ; set %do i = 1 %to %sysfunc(month("&sysdate"d)) ; &pref&i /* NO SEMI-COLON */ %end ; ; run ; %mend catdata1 ; %catdata1 ( pref=subro.mydata , out=subro.result )

and

%macro catdata2 ( pref= , out=) ; %local i list ; %* always declare macro variables local! ; %do i = 1 %to %sysfunc(month("&sysdate"d)) ; %let list = &list &pref&i ; %end ; data &out ; set &list ; run ; %mend catdata2 ; %catdata2 ( pref=subro.mydata , out=subro.result )

CATDATA1 illustrates a common answer, but one passed up by Gregg and Paul because they are too good. So what is wrong with it? 1) It is harder to read. 2) The solution is to tied even more strongly to your problem and therefore almost inapplicable to any other use. But it is important, because many people never graduate from this style of coding, so one has to get use to macro instructions nested inside SAS code. Someday, not too soon, you might ask yourself why it is possible to have the semi-colons inside the SET statement.

CATDATA2 is somewhat better in style, but it still suffers from tying the code too close to your problem and failing to analyze that a list making macro will lead to a much more useful tool. On the other hand, using this style makes it easier to separate the real macro CAT hiding in this code.

I have not discussed all the answers given or possible. You might try to analyze what is good about each of them and what is the cost in using them. Whether the code is a good answer or not lies in part, where you are coming from, and where you want to get to. But all the answers indicate techniques that are worth learning and applying in the appropriate situations.

Incidentally, I created my test data with

data %mac(pref=w,n=4) ; x = 1 ; run ;

A good question to ask in evaluating the answers given - Which ones can be used to create the test data? Most macros will not have this property, so it is not a general question, but if a macro does have this property it is a good indication that something worth understanding is going on.

I changed your names to suit me, in each example consider the amount of work needed to change the names. In general, when writing macro code, there is a macro author and a macro consumer, who does not have to modify the macro author's code. For each example, identify the author's code versus the consumer's code and then consider when changes to the authors code is required for a consumer's change. (Often the consumer is the macro author, but it is always wise to distinguish them anyway because it will make you a better macro author.)

I hope this gives you a little idea of what is involved in writing good macro code and why a thorough grounding in SAS and programming are needed. The principles discussed here are probably only learned via the study of many examples.

Ian Whitlock =================== Date: Sun, 30 Apr 2006 08:04:09 -0700 Reply-To: lei <leiliang217@YAHOO.COM> Sender: "SAS(r) Discussion" From: lei <leiliang217@YAHOO.COM> Organization: http://groups.google.com Subject: How to automatically change data name in SET? Comments: To: sas-l Content-Type: text/plain; charset="iso-8859-1" Hi everyone, I have a question. I have a few datasets, say mydata1, mydata2, mydata3, till mydata&M where M is dymically changing according to month. For example in March M=3, in April M=4. So I wrote a program like this but got error message DATA SUBRO.RESULT; DO I = 1 TO &m; SET SUBRO.MYDATA&I; END RUN; The effect I want to achieve is that, for example, for m = 3 DATA SUBRO.RESULT; SET SUBRO.MYDATA1; SET SUBRO.MYDATA2; SET SUBRO.MYDATA3; RUN; for m = 4, it becomes DATA SUBRO.RESULT; SET SUBRO.MYDATA1; SET SUBRO.MYDATA2; SET SUBRO.MYDATA3; SET SUBRO.MYDATA4; RUN; Is there any way to do this? Thank you!


Back to: Top of message | Previous page | Main SAS-L page