Date: Wed, 15 Oct 2008 10:45:02 -0500
Reply-To: "./ ADD NAME=Data _null_," <iebupdte@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "./ ADD NAME=Data _null_," <iebupdte@GMAIL.COM>
Subject: Re: Inserting date
In-Reply-To: <b7a7fa630810150812u68a62eb5r81a94ed02cd2bf28@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
It would take many observations to show a performance difference.
Using retain would be similar to optimizing the code such that the
assignment is only executed one time.
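data;
date = '14OCT08'D;
do until(eof);
set _last_ end=eof; /* _LAST_ = most recently created data set, same as a bare SET */
output; /* without an explicit OUTPUT only the last observation would be written */
end;
run;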
but using retain is easier
data;
retain date '14oct08'd;
set;
run;
If the input data set already contains the variable "date", neither
RETAIN nor the coded input loop would work, because SET overwrites the
value with the one read from the input. You would want to DROP date
on the input data set.
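A minimal sketch of that DROP, borrowing the HAVE and NEED names and
the DATE9. format from Randy's original post. It assumes HAVE really
does contain a variable named DATE; if it does not, DROP= on the input
will normally be reported as an error.

```sas
data need;
   retain date '14oct08'd;   /* constant is set once, before any observation is read */
   set have(drop=date);      /* assumption: HAVE already has DATE; drop it on the way in */
   format date date9.;
run;
```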
Plus it makes you look like you know what you're doing. Similarly when you see
data a;
set lib.a;
run;
proc sort data=a
...
you know the person who wrote it doesn't know what they're doing.
Another one I see all the time when I look at my colleagues' programs
is what I call "keeping on the wrong side". For example, in clinical
trials programming the data you need to access typically have many more
variables than you need for a given summary. Also, clinical trials
data, both raw data and CDISC data, are loaded with redundant,
unnecessary variables, but don't get me started on that.
I typically see
data a;
set cdisc.LB;
keep subjid lbtestcd lbstresn;
run;
proc sort;
by lbtestcd;
run;
where a much more efficient program would be
proc sort
data=cdisc.lb
(
keep=subjid lbtestcd lbstresn
/*perhaps WHERE= here for subset*/
)
out=work.a;
by lbtestcd;
run;
This example is completely contrived but it illustrates my point
fairly well. You want to keep on the INPUT side just the variables
you need. You can get a lot of programming done on the INPUT side
using the KEEP=, RENAME=, and WHERE= data set options.
It is more efficient and it makes you look like you know what you're doing.
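Here is a rough sketch of all three options together on the input
side. It is just as contrived as the example above: the new name
RESULT and the test codes 'ALT' and 'AST' are made up for
illustration, not taken from any real study.

```sas
proc sort
   data=cdisc.lb
      (
      keep   = subjid lbtestcd lbstresn       /* read only the variables needed */
      rename = (lbstresn=result)              /* RESULT is a hypothetical name */
      where  = (lbtestcd in ('ALT','AST'))    /* assumed test codes, for illustration */
      )
   out=work.a;
   by lbtestcd subjid;
run;
```

The WHERE= expression and the BY statement deliberately use only
variables that are not renamed, so the sketch does not depend on
which name is visible at which point.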
On 10/15/08, Joe Matise <snoopy369@gmail.com> wrote:
> What is the substantive reason to use RETAIN instead of assignment? Don't
> both give the same result at the end of the day? Is there a processing time
> difference or something?
>
> -Joe
>
>
> On Wed, Oct 15, 2008 at 9:54 AM, ./ ADD NAME=Data _null_,
> <iebupdte@gmail.com> wrote:
> > You mean the mistake of doing this at all, or the mistake where you
> > leave off the D on the date constant or the one where you use
> > assignment instead of RETAIN.
> >
> > data need;
> >
> >
> >
> >
> >
> >
> > On 10/15/08, Randy <randistan69@hotmail.com> wrote:
> > > I want to insert a date column into the entire dataset. My code is:
> > >
> > > data need; set have ;
> > > format date_one date9. ;
> > > date_one = '14OCT2008' ;
> > > run;
> > >
> > > what is the mistake?
> > >
> > > Randy
> > >
> >
>
>