Date: Tue, 9 Feb 2010 14:39:21 -0600
Reply-To: "White, Svend A." <SvendW@HEALTH.OK.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "White, Svend A." <SvendW@HEALTH.OK.GOV>
Subject: Re: Saving values in a format in text form
In-Reply-To: A<ce1fb7451002091227x6288dbbu519c9eb49e792ef9@mail.gmail.com>
Content-Type: TEXT/plain; charset="us-ascii"
Aha! I should have known there was an infinitely easier way to do this.
I had noticed those options in some papers I'd downloaded, but didn't
put two and two together. Thank you.
Svend
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Data _null_;
> Sent: Tuesday, February 09, 2010 2:27 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: Saving values in a format in text form
>
> Have a look at the control data set created by the option CNTLOUT=
>
> proc format
> library = DEVDATA7.formats CNTLOUT=CONTROL;
> select mdc; /* Or get them all in one data set by omitting
> SELECT */
> run;
>
> Once you have this data set you can add values or otherwise modify the
> formats as needed. Then use CNTLIN to recreate the formats. No loss
> of data will occur.
>
> proc format
> library = DEVDATA7.formats CNTLIN=CONTROL;
> run;
>
>
> On 2/9/10, White, Svend A. <SvendW@health.ok.gov> wrote:
> > I need to tweak some formats that are currently in use and as part
of
> > the process am extracting all the existing formats and creating a
new
> > setup script (which hopefully won't get lost this time).
> >
> > Here's the process I'm using.
> > 1. Execute a bunch of PROC FORMAT statements like the above, one for
> > each existing format.
> > 2. Copy & paste all of the output into a text editor.
> > 3. Use regular expressions to transform all that into a series of
> VALUE
> > sets that can be plugged into a PROC FORMAT step to recreate the
> format
> > collection.
> >
> > This inelegant process is working fine except for the truncation of
> > longer values in the output window, which necessitates manual
> editing.
> >
> > Hence my three questions:
> >
> > 1) Is there a smarter way to do this? I know that in an RDBMS one
> > usually employ dynamic SQL, for example, to generate data design
> layer
> > SQL for populating a table. It's tedious, but sometimes it can save
> you
> > a huge amount of work. Is there an analogous option here for
creating
> a
> > setup script out of an existing format collection?
> >
> > 2) Am I right in thinking that when one adds a new value set to a
> format
> > collection one must re-create them all? My impression is that one
> cannot
> > just add a single value set (e.g., adding county to a format
> collection
> > already containing, city and state).
> >
> > 3) Assuming that this is how I have to do things, can one modify the
> > output of this code so none of the values are truncated to make them
> fit
> > on the screen (perhaps by outputting them to a text file)?
> >
> > proc format
> > library = DEVDATA7.formats fmtlib;
> > select mdc;
> > run;
> >
> >
> > Thanks.
> >
> >
> >
> > > -----Original Message-----
> > > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf
> Of
> > > Mike Zdeb
> > > Sent: Tuesday, February 09, 2010 1:09 PM
> > > To: SAS-L@LISTSERV.UGA.EDU
> > > Subject: Re: How to attach a numeric index to column names for use
> in
> > a
> > > loop?
> > >
> > > hi ... OK, you win !!!
> > >
> > >
> > > I tried the loop versus your code with a large data set
> > > (well large length-wise ... 20 million obs, if not width-wise ...
> 10
> > > variables)
> > >
> > > loop ... multiple PROC FREQs on 10 single variables (about 5 secs
> cpu
> > > each)
> > > ods output ... versus one PROC FREQ on all 10 variables (about 20
> secs
> > > cpu)
> > >
> > > the elapsed time difference was even more in favor of the "one
time
> > > PROC FREQ"
> > >
> > > the post-processing code is neat (coalescec to find the data ...
> nice)
> > >
> > >
> > > so, in the words of Emily Litella ... "Never mind."
> > >
> > >
> > > --
> > > Mike Zdeb
> > > U@Albany School of Public Health
> > > One University Place
> > > Rensselaer, New York 12144-3456
> > > P/518-402-6479 F/630-604-1475
> > >
> > > > I'm not convinced that creating a data set for each variable is
> ever
> > > > going to be an advantage when the desired result is a single
> > > "stacked"
> > > > data set. Depending on what the desired result finally needs to
> be
> > > > there are a easy routes to the variable values. Here is one
> using
> > > the
> > > > F_ variables.
> > > >
> > > > ods listing close;
> > > > proc freq data=sashelp.class;
> > > > ods output onewayfreqs=onewayfreqs;
> > > > run;
> > > > Ods listing;
> > > > data results(keep=variable label value Frequency Percent Cum:);
> > > > length variable $32 label $256 value $64;
> > > > set oneWayFreqs;
> > > > variable = scan(table,-1);
> > > > label = vlabelX(variable);
> > > > value = coalesceC(of F_:);
> > > > run;
> > > > proc print;
> > > > run;
> > > >
> > > > On 2/9/10, Mike Zdeb <msz03@albany.edu> wrote:
> > > >> hi ... if you want to go this route, here's another idea
> > > >> with just the OUT= optiion in PROC FREQ
> > > >> (I think that you end up with the same data set)
> > > >>
> > > >> and if you weren't skipping variables, put the SQL part inside
> the
> > > macro and add
> > > >> arguments to the macro call ... libname and data set name
> > > >>
> > > >>
> > > >>
> > > >> proc sql noprint;
> > > >> select name, count(*)
> > > >> into :varlist separated by '/', :nvars
> > > >> from dictionary.columns
> > > >> where libname eq 'SASHELP' and memname eq 'CLASS' and
> upcase(name)
> > > ne 'NAME';
> > > >> quit;
> > > >>
> > > >> %macro get_freqs;
> > > >> proc datasets lib=work nolist;
> > > >> delete results_all;
> > > >> quit;
> > > >>
> > > >> %do j=1 %to &nvars;
> > > >> %let v=%scan(&varlist,&j,'/');
> > > >> proc freq data=sashelp.class;
> > > >> table &v / noprint out=results;
> > > >> run;
> > > >>
> > > >> data results;
> > > >> length variable $32 value $15;
> > > >> retain variable "&v" ;
> > > >> set results;
> > > >> value = cat(&v);
> > > >> if vtype(&v) eq 'N' then value=right(value);
> > > >> drop &v;
> > > >> run;
> > > >>
> > > >> proc append base=results_all data=results;
> > > >> run;
> > > >> %end;
> > > >>
> > > >> proc datasets lib=work nolist;
> > > >> delete results;
> > > >> quit;
> > > >> %mend;
> > > >>
> > > >> %get_freqs;
> > > >>
> > > >> --
> > > >> Mike Zdeb
> > > >> U@Albany School of Public Health
> > > >> One University Place
> > > >> Rensselaer, New York 12144-3456
> > > >> P/518-402-6479 F/630-604-1475
> > > >>
> > > >> > And here is how to to multiple procedures with variables;
this
> > > demonstrates getting a frequency on every variable i the set; but
> > other
> > > >> procedures
> > > >> > could be placed inside the loop as well. Note the %do loop
> has
> > to
> > > be inside a macro wrapper for this to work.
> > > >> >
> > > >> > ods output variables=variables;
> > > >> > proc contents data=setname;
> > > >> > run;
> > > >> >
> > > >> > proc sql noprint;
> > > >> > create table variables_to_process as
> > > >> > select variables.variable
> > > >> > from variables
> > > >>
> > > >> > where variable not in ('ID')
> > > >> > order by num;
> > > >> > quit;
> > > >> >
> > > >> > data variables_to_process;
> > > >> > set variables_to_process;
> > > >> > runnum + 1;
> > > >> > run;
> > > >> >
> > > >> > %macro get_freqs;
> > > >> >
> > > >> > data results_all;
> > > >> > informat variable $50. value $50. frequency best32. percent
> > > best32.
> > > >> > cumfrequency best32. cumpercent best32.;
> > > >> > stop;
> > > >> > run;
> > > >> >
> > > >> > proc sql noprint;
> > > >> > select max(runnum) into :max from variables_to_process;
> > > >> > quit;
> > > >> >
> > > >> > %do i=1 %to &max;
> > > >> > proc sql noprint;
> > > >> > select variable into :variable
> > > >> > from variables_to_process
> > > >> > where runnum=&i;
> > > >> > quit;
> > > >> > %put processing variable: &variable;
> > > >> >
> > > >> > ods output onewayfreqs=onewayfreqs;
> > > >> > proc freq data=setname;
> > > >> > tables &variable;
> > > >> > run;
> > > >> >
> > > >> > data results;
> > > >> > informat variable $50. value $50. frequency best32. percent
> > > best32.
> > > >> > cumfrequency best32. cumpercent best32.;
> > > >> > set onewayfreqs;
> > > >> > variable="&variable";
> > > >> > value=&variable;
> > > >> > keep variable value frequency percent cumfrequency
cumpercent;
> > > >> > run;
> > > >> >
> > > >> > proc append base=results_all data=results;
> > > >> > run;
> > > >> > %end;
> > > >> >
> > > >> > %mend get_freqs;
> > > >> >
> > > >> > %get_freqs;
> > > >> >
> > > >> >
> > > >> >
> > > >> > --- mlhoward@avalon.net wrote:
> > > >> >
> > > >> > From: Mary <mlhoward@avalon.net>
> > > >> > To: Church <H.J.Plat@UVA.NL>
> > > >> > Cc: <SAS-L@LISTSERV.UGA.EDU>
> > > >> > Subject: Re: How to attach a numeric index to column names
for
> > use
> > > in a loop?
> > > >> > Date: Tue, 9 Feb 2010 08:33:56 -0800
> > > >> >
> > > >> > You can save the variable names to a data set; this is one
> way:
> > > >> >
> > > >> > ods output variables=variables;
> > > >> > proc contents data=setname;
> > > >> > run;
> > > >> >
> > > >> > Then you can select the variables needed into a macro
> variable:
> > > >> >
> > > >> > proc sql noprint;
> > > >> > select variable into :variable_list
> > > >> > from variables
> > > >> > separated by ' '
> > > >> > where variable not in ('id');
> > > >> > quit;
> > > >> > %put &variable_list;
> > > >> >
> > > >> > data setname2;
> > > >> > set setname;
> > > >> > array vararray{*} &variable_list;
> > > >> > do i=1 to dim(vararray);
> > > >> > if vararray[i]= .... then vararray[i]=....;
> > > >> > end;
> > > >> > run;
> > > >> >
> > > >> >
> > > >> > If you need to do something outside of a data step- then
write
> > > back- I usually do this in a macro and *pop* the variable list off
> to
> > > process
> > > >> each
> > > >> > variable.
> > > >> >
> > > >> >
> > > >> > -Mary
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --- H.J.Plat@UVA.NL wrote:
> > > >> >
> > > >> > From: Church <H.J.Plat@UVA.NL>
> > > >> > To: SAS-L@LISTSERV.UGA.EDU
> > > >> > Subject: Re: How to attach a numeric index to column
> names
> > > for use in a loop?
> > > >> > Date: Tue, 9 Feb 2010 03:18:21 -0800
> > > >> >
> > > >> > On 9 feb, 12:03, Chris Jones <chris...@gmail.com> wrote:
> > > >> >> On 9 Feb, 07:52, Church <H.J.P...@uva.nl> wrote:
> > > >> >>
> > > >> >> > Hi,
> > > >> >>
> > > >> >> > I have a dataset of about 100 columns.
> > > >> >> > I use a loop where each time I need a different column. I
> > don't
> > > want
> > > >> >> > to write the 100 column names in the code, but rather do
it
> in
> > > some
> > > >> >> > more standardized way, for example by attaching an index
> > number
> > > to
> > > >> >> > each column and then referring to this column using an
> index
> > > variable
> > > >> >> > (so for example do i=1 to n; and than selecting column i
> each
> > > time).
> > > >> >> > Does anyone know how to do this?
> > > >> >>
> > > >> >> > Thanks.
> > > >> >>
> > > >> >> Use arrays.
> > > >> >>
> > > >> >> In your initial data, create the columns with numerical
> > suffixes,
> > > e.g.
> > > >> >>
> > > >> >> data mydata ;
> > > >> >> array n{*} d1-d100 ; /* create variables d1 thru to d100
*/
> > > >> >> /* further processing */
> > > >> >> run ;
> > > >> >>
> > > >> >> data myprocess ;
> > > >> >> set mydata ;
> > > >> >> array n{*} d1-d100 ;
> > > >> >>
> > > >> >> do i = 1 to dim(n) ;
> > > >> >> /* to reference d1 use n{1}, d2 > n{2}, etc. */
> > > >> >> /* do something with each element */
> > > >> >> /* e.g. */
> > > >> >> n{i} = n{i} * 2 ;
> > > >> >> end ;
> > > >> >> run ;
> > > >> >
> > > >> > Thanks. That sounds good, but I have original variables with
> > > original
> > > >> > variable names that I first have to assign to the array
> elements
> > > in
> > > >> > this case. So then I still need to write the column name,
> don't
> > I?
> > > >> >
> > > >>
> > > >
> >
|