Date: Fri, 3 Oct 1997 19:00:03 -0400
Reply-To: Anthony Ayiomamitis <ayiomamitis@IBM.NET>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Anthony Ayiomamitis <ayiomamitis@IBM.NET>
Subject: Re: Any Efficient Way?
Content-Type: text/plain; charset=us-ascii
SH Group wrote:
>
> Please help me with the following problems.
>
> I have a variable called "pressure" and I also have another 250 variables
> from v1-v250.
> Now I want to create new variables in my dataset as follows:
>
> ratio1=v1/pressure*100;
> .
> .
> .
> ratio250=v250/pressure*100;
>
> Meanwhile, I want to rename all v1-v250 into x1-x250 in my dataset.
>
> I know there must be an efficient way to do what I need.
>
> Thanks for help.
>
Bill,
In a previous message I attempted to post two solutions to your
problem above, and I made a mess by trying to do two things (i.e.
present two solutions) at once.
So, to clarify matters, I will take things one at a time, cutting
and pasting from the original message and from a follow-up message that
attempted to correct the errors of my first posting but which
introduced a new error (long day at work on Monday).
My first solution was:
    data _null_;
      file 'c:\rename.sas';
      do i=1 to 250;
        put 'v' i ' = x' i;
      end;
    run;

    data newbill;
      set oldbill;
      array ratios ratio1-ratio250;
      array vs v1-v250;
      do over ratios;
        ratios = vs / pressure * 100;
      end;
      rename
      %include 'c:\rename.sas';;
    run;
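A variant of the first solution, offered only as a sketch: since %INCLUDE is documented as belonging at a statement boundary, the generated file can hold the complete RENAME statement (keyword and closing semicolon included), so the second data step simply includes it as one whole statement. The file name and dataset names are the same as above; the +(-1) pointer controls are my addition and only tidy up the blanks that list output leaves after each number.

```sas
/* Sketch: write a complete RENAME statement to the generated file */
data _null_;
  file 'c:\rename.sas';
  put 'rename';
  do i = 1 to 250;
    /* +(-1) backs up over the blank that list output adds after i */
    put '  v' i +(-1) ' = x' i +(-1);
  end;
  put ';';
run;

data newbill;
  set oldbill;
  array ratios ratio1-ratio250;
  array vs v1-v250;
  do over ratios;
    ratios = vs / pressure * 100;
  end;
  /* the included file is a whole statement, so it starts at a boundary */
  %include 'c:\rename.sas';
run;
```

The generated file then reads "rename", followed by 250 lines of the form "v1 = x1", followed by a lone semicolon.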
My second solution was:
    data newbill (drop=v1-v250);
      set oldbill;
      array ratios ratio1-ratio250;
      array vs v1-v250;
      array xs x1-x250;
      do over ratios;
        ratios = vs / pressure * 100;
        xs = vs;
      end;
    run;
I have received correspondence about the pros and cons of the two
attempted solutions. The bias is towards the second solution above,
which is the one I prefer less.
The first solution, although it involves an extra but brief data
step and the creation of a temporary file, is in my view the better one:
it simply provides a mapping of the old variable names to the new ones
and uses RENAME, the statement intended for renaming one or more
variables.
The second solution, although more compact (one data step),
involves unnecessary processing in an indirect attempt to rename the
variables. I do not prefer this solution since, for the example given,
we are performing 250 extra assignments (xs = vs) for each record.
Therefore, with an input dataset of a million records, we are looking
at an additional and unnecessary 250 million operations.
In contrast, the use of the generated RENAME statement employed by
the first solution bypasses this unnecessary and additional computing at
the expense of having the additional data step and temporary external
file.
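For completeness, a third possibility, again only a sketch: assuming a release of SAS whose RENAME statement accepts numbered range lists (an assumption to check against your own installation's documentation), the mapping can be written directly, with neither the extra data step nor the temporary file.

```sas
/* Sketch: assumes RENAME accepts numbered range lists in this release */
data newbill;
  set oldbill;
  array ratios ratio1-ratio250;
  array vs v1-v250;
  do over ratios;
    ratios = vs / pressure * 100;
  end;
  /* pairs v1 with x1, v2 with x2, ..., v250 with x250 */
  rename v1-v250 = x1-x250;
run;
```

As in the first solution, the renaming takes effect on the output dataset, so the array statement still refers to v1-v250 and no per-record copying is done.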
Thoughts and opinions are welcome (and expected <g>).
Anthony.