Date: Sun, 18 Dec 2005 23:48:23 -0800
Reply-To: RolandRB <rolandberry@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: RolandRB <rolandberry@HOTMAIL.COM>
Organization: http://groups.google.com
Subject: Re: Decoded variables (and how long can SAS stmts be).
In-Reply-To: <1134752057.640983.245430@o13g2000cwo.googlegroups.com>
Content-Type: text/plain; charset="iso-8859-1"
data _null_; wrote:
> Seems to me that if you just write the code to a file and then %inc you
> don't have to worry about long lines, long statements and CALL EXECUTE.
I don't like writing to a file and %inc'ing it. I'd rather it stayed in
working memory. The fewer files created, the better, as far as I am
concerned.
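For what it's worth, here is a minimal sketch of the in-memory approach.
It assumes WORK.CLASS exists and that Sex carries a format such as $sex.
The output name class_decoded is made up for illustration. Each CALL
EXECUTE passes one short piece of code, and the stacked statements run
once the DATA _NULL_ step has finished.

```sas
/* Sketch only: build a decode step in memory with CALL EXECUTE.     */
/* Assumption: WORK.CLASS exists and Sex has a format attached.      */
/* WORK.CLASS_DECODED is a hypothetical output name.                 */
data _null_;
  call execute('data work.class_decoded;');
  call execute('  set work.class;');
  call execute('  length Sex__D $6;');
  call execute('  Sex__D = putc(Sex, vformat(Sex));');
  call execute('run;');
run;
```

Nothing is written to disk here; the generated step only shows up in the
log.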
> The "generated" program is much easier to debug, and the consumer of
> the program and/or output (new data sets) would find the "generated"
> program easier to understand.
>
> Plus you can easily generalize to the library level.
If you have a macro that can decode a single dataset then it is better
to keep it like that and call it for every library member, if you want
to convert a whole library. What if you really wanted to run it on a
single dataset in a library? Would you have a second utility to do it?
It is better if you can split tasks up that way. It means you end up
with more useful utilities that can be used in circumstances that may
arise in the future and it becomes easier to approach complex tasks.
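Roughly what I have in mind, as a sketch: the macro name decode_one and
its ds= parameter are hypothetical, and the body is only a placeholder
for the real single-dataset logic. The library is assumed to be WORK.

```sas
/* Hypothetical single-dataset utility; the body is a placeholder. */
%macro decode_one(ds=);
  %put NOTE: decode_one called for &ds;
%mend decode_one;

/* Call it once for every data set in a library (WORK assumed).    */
/* %nrstr delays the macro call until the DATA step has finished.  */
data _null_;
  set sashelp.vtable(where=(libname='WORK' and memtype='DATA'));
  call execute(cats('%nrstr(%decode_one(ds=', libname, '.', memname, '))'));
run;

/* The same utility still works on a single member. */
%decode_one(ds=work.class)
```

The library-level job is then just a loop around the single-dataset
macro, so there is no second utility to maintain.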
Also, are you cancelling the formats on the coded variables? I can't
tell from your code. I think that should be an option if it isn't
already there.
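By cancelling the formats I mean something like the following sketch,
which assumes WORK.CLASS with formatted Sex and group as in your example.
It should only be run after the decoded variables have been created,
since the decode relies on the formats still being attached.

```sas
/* Sketch: remove the formats from the coded variables so that     */
/* the raw codes are displayed. Assumes WORK.CLASS as quoted.      */
proc datasets library=work nolist;
  modify class;
    format Sex group;   /* listing variables with no format removes it */
  run;
quit;
```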
What I am going to do this morning is write an "fsvdc" utility (fsv
decode) so that I can browse a dataset in a library and see the decoded
values alongside the coded values (unformatted). This is where
splitting a task into its components comes in handy. I hadn't thought
to do this when I set out to write the macro but using it this way came
to me later and is more useful to me.
> Example "GENERATED" Program:
>
> Data WORK.CLASS ;
>    Attrib Name Length=$8 ;
>    Attrib Sex Length=$1 ;
>    Attrib Sex__D Length=$6 ;
>    Attrib Age Length= 8 ;
>    Attrib Height Length= 8 ;
>    Attrib Weight Length= 8 ;
>    Attrib group Length= 8 LABEL="Behavior group" ;
>    Attrib group__D Length=$16 LABEL="Behavior group" ;
>    Set WORK.CLASS ;
>    Sex__D = putC(Sex,vformat(Sex));
>    group__D = putN(group,vformat(group));
>    Run;
> Data WORK.CLASS2 ;
>    Attrib Name Length=$8 ;
>    Attrib Sex Length=$1 ;
>    Attrib Sex__D Length=$6 ;
>    Attrib Age Length= 8 ;
>    Attrib Height Length= 8 ;
>    Attrib Weight Length= 8 ;
>    Attrib group Length= 8 LABEL="Behavior group" ;
>    Attrib group__D Length=$16 LABEL="Behavior group" ;
>    Set WORK.CLASS2 ;
>    Sex__D = putC(Sex,vformat(Sex));
>    group__D = putN(group,vformat(group));
>    Run;
>
> Program that wrote the program above: I did not adequately address the
> issues of name conflicts that could arise or how to determine which
> variables need to be decoded.
>
> proc format;
>    value $sex 'F'='Female' 'M'='Male';
>    value group 1='Good kids' 2='Not so good kids';
>    value vt 1=' ' 2='$';
>    value dctype 1='N' 2='C';
>    run;
> data work.class work.class2;
>    set sashelp.class;
>    group = rantbl(12345,.4,.6);
>    format group group. sex $sex. age f2. weight 5.1;
>    label group='Behavior group';
>    run;
> proc contents varnum fmtlen mtype=data
>       data = work._all_
>       out = work.contents(keep=libname memname name varnum type
>                                format formatl length label);
>    run;
> proc sort data=work.contents;
>    by libname memname varnum;
>    run;
> data work.contents;
>    set work.contents;
>    outlibname = libname; /* change this for different output library */
>    output;
>    if format not in('DATE','DATETIME','BEST','TIME',' ','$','F','Z')
>       then do;
>       /* still need better test for decode vars */
>       varnum = varnum + .5;
>       oname = name;
>       name = trim(name)||'__D'; /* this is not adequate */
>       decodetype = type;
>       type = 2;
>       length = formatl;
>       output;
>       end;
>    run;
> filename pgm '_program_.sas';
> data _null_;
>    file pgm;
>    do until(last.memname);
>       set work.contents end=eof1;
>       by libname memname;
>       if first.memname then put 'Data ' outlibname +(-1) '.' memname
>          ';';
>       put +3 'Attrib ' name $16. 'Length=' type vt. length @;
>       if not missing(label) then put label=$quote100. @;
>       put ';';
>       end;
>    put +3 'Set ' libname +(-1) '.' memname ';';
>    do until(last.memname);
>       set work.contents end=eof1;
>       by libname memname;
>       if not missing(decodetype) then do;
>          put +3 name $16. ' = ' @;
>          put 'put' decodetype dctype1. '(' oname +(-1) ',vformat('
>              oname +(-1) '));';
>          end;
>       if last.memname then put +3 'Run;';
>       end;
>    run;
> %inc pgm / source2;
>
>
>
> RolandRB wrote:
> > The length of the list variables can go as high as 32767 if need be so
> > you shouldn't have a problem no matter how many variables you have in
> > your datasets. Just up the value, currently at 2000 in the code, to
> > something higher than you think you will ever need.