Date: Fri, 26 Sep 2003 04:58:20 -0400
Reply-To: Ben Powell <ben.powell@CLA.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ben Powell <ben.powell@CLA.CO.UK>
Subject: Re: Data Step Question
Thanks Matt and others. I expect I introduced an error in naming variables,
which is why the data step would not work and why I broke it up. That made it
inefficient, though that is not much of an issue on datasets of fewer than
100k rows. I have finally got round to putting it back together here!
Thanks also for the tip on the Data Step manual. Can you suggest a good
one, preferably one available from the UK? Otherwise I'd be
paying express US postage from fatbrain.com or somesuch, which frankly
spoils the party.
Date: Thu, 25 Sep 2003 14:01:50 -0700
From: m n <iced_phoenix@YAHOO.COM>
Subject: Re: Data Step Question
ben.powell@CLA.CO.UK (Ben Powell) wrote in message
> I'm not clear on what can be done in any single datastep and would
> appreciate any pointers.
> Say for example I want to copy a dataset to my work lib, rename a
> variable and keep that and one other variable. Because in the past I
> have found that sometimes a step won't action unless there is a run
> command after it I would break this job into 3 separate data steps,
> which is quite repetitive. Is there a rule of thumb for when a new
> datastep is needed and how many steps can be included in a datastep?
> data a;
>   set lib.a;
> run;
> data a (rename=(var1=var2));
>   set a;
> run;
> data a;
>   set a;
>   keep var2 var3;
> run;
It seems that there's a misunderstanding of the data step at work here.
Reading any good SAS Data Step programming manual will help; in the
meantime, perhaps it will help to note a couple of properties of the data step:
* Each data step is a separate program; it is compiled and executed
separately from every other step in your SAS job. Thus, once you start
thinking of the data step as a separate program in its own right, you can
start to exploit its tremendous power.
* Each data step is an implicit loop. A SAS Data Step manual will explain
A LOT about this fundamental concept.
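To make the implicit loop visible, here is a minimal sketch. The data set
work.demo and its three rows are made up purely for illustration; the
automatic variable _N_ counts the iterations of the data step.

```sas
/* Hypothetical three-row data set, created only for this illustration */
data work.demo;
  input var1 var3;
  datalines;
10 1
20 2
30 3
;
run;

/* No explicit loop is written here: the step body runs once per observation */
data _null_;
  set work.demo;                          /* reads the next observation      */
  put 'Iteration ' _n_ ': ' var1= var3=;  /* writes one line to the SAS log  */
run;
```

The second step should write one log line per observation and then stop by
itself when the SET statement runs out of rows to read.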
In the meantime, other posters solved your problem using data set options.
To give you some further insight, the same thing could be accomplished with:
data a;
  set lib.a (rename=(var1=var2));
  keep var2 var3;
run;
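For comparison, the data set option route that other posters took might look
like the sketch below. It assumes lib.a exists and contains var1 and var3.
One detail worth knowing: when KEEP= and RENAME= appear together as data set
options, KEEP= is applied first, so it has to refer to the old name.

```sas
/* Sketch only: assumes lib.a exists with variables var1 and var3 */
data a;
  /* KEEP= is applied before RENAME=, so it lists var1, not var2 */
  set lib.a (keep=var1 var3 rename=(var1=var2));
run;
```

Putting KEEP= on the input data set means the unwanted variables are never
read into the step at all, which can matter on wide data sets.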
Picture the above data step as a loop, each iteration reading in one
observation from lib.a (via the set statement) and simultaneously renaming
var1 to var2, then outputting only var2 and var3. You can do a tremendous
amount of other processing in this implicit loop; each iteration of the
loop will carry out the processing on just the current observation.
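As a rough illustration of "other processing" inside the same loop, the
sketch below adds a filter and a derived variable. It assumes var2 and var3
are numeric; the variable name ratio and the filter condition are
hypothetical and were not part of the original question.

```sas
/* Sketch only: assumes lib.a exists and var1 and var3 are numeric */
data a;
  set lib.a (rename=(var1=var2));
  if var3 > 0;              /* subsetting IF: skip rows where this is false */
  ratio = var2 / var3;      /* hypothetical derived variable                */
  keep var2 var3 ratio;
run;
```

Each statement runs against the current observation only, so the read,
rename, filter, calculation and output all happen in a single pass.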
I hope that helps,