|Date: ||Fri, 29 Aug 2003 11:24:52 -0400|
|Reply-To: ||Ian Whitlock <WHITLOI1@WESTAT.COM>|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||Ian Whitlock <WHITLOI1@WESTAT.COM>|
|Subject: ||Re: proc format from dataset for multiple variables.|
Here is some code to help you get started. Note I changed the structure of
the control file per Howard's directions.
/* test data */
data w ;
do obs = 1 to 100 ;
num_b = 300 * ranuni ( 14231 ) ;
/* get mean */
proc sql noprint ;
select mean(num_b) into :mean_b
from w ;
/* make format data */
data fmtdata ;
retain mean_b &mean_b ;
input Fmtname $ Start $ end $ label $ hlo $ ;
if upcase(fmtname) = "NUM_BFMT" then
label = put ( mean_b , best12. ) ;
Num_Bfmt . . . .
Num_Bfmt low 100 . l
Num_Bfmt 300 high . h
/* make the format */
proc format cntlin = fmtdata fmtlib ;
/* prove of the pudding */
data q ;
set w ;
newb = input(put(num_b,num_bfmt12.6), best12.) ;
From: Pankaj Singal [mailto:pankaj.singal@WELLSFARGO.COM]
Sent: Friday, August 29, 2003 10:54 AM
Subject: Re: proc format from dataset for multiple variables.
When I tried to implement the process for one numeric and character
variable, the character variable runs fine, but I am facing problems with
the numeric variables. How to go about the same.
This is what I intend to do: (I have numeric variable num_b that can have
legal values b/w 100-300. and if in my input dataset the value for num_b is
in the range ([100-300]), we do not change anything but if it is not, we
change as per the following.
Fmtname Start Label
Num_Bfmt . mean(num_b)
Num_Bfmt low-100 mean(num_b)
nu_Bfmt 300-high mean(num_b)
How do I make a caro that calculates the mean and put it in the
control_file. Also, I am not sure how to incorporate the range that is
valid. i.e. if the num_b = 105, then my variable in "test" dataset should be
105. How to get that in without using an if statement in data step.
Any suggestions, would be greatly appreciated.
Howard_Schreier@ITA.DOC.GOV (Howard Schreier) wrote in message
> Seems like a good approach.
> I suggest that you you make your control dataset conform as much as
> possible to the structure and naming conventions of a SAS PROC FORMAT
> CNTLIN dataset.
> On Wed, 27 Aug 2003 09:23:35 -0700, Pankaj Singal
> <pankaj.singal@WELLSFARGO.COM> wrote:
> >I have a huge dataset(sample) (>100 variables/fields eg, A B C D E F
> >...A1..A75...). I am trying to create new dateset (test) with all
> >valid entries for various fields in "sample" dataset. I created a
> >separate dataset (control_dataset) that has 5 columns, which will
> >help me create the proc formats to create the "test" dataset with
> >valid entries.
> >Var_name old_value Operator new_value Description
> >A "1" EQ "WALK" "1 changes to Walk"
> >A "2" EQ "RUN" "2 changes to RUN"
> >A other else "INVALID" "Invalid entry"
> >B [100,300] EQ [100,300] "valid range 100-300"
> >B other else mean "invalid value replaced by
> >>>>This table lists each valid entry for the variable. E.g. A can
> >valid values for 1 or 2, if it is not 1 or 2 then the entry is
> >>>>Other includes invalid entries for that variable and assuming that
> >it will pick up the missing values too.
> >There are both character and numeric fields that needs to be
> >validated and put in "test" dataset. I need to get the new_value in
> >the variable and also the description field associated with the same.
> >(There are around 75 fields that needs to be validated).
> >I am trying to create the proc formats from control_dataset to create
> >"test" dataset. that would have the
> >I am not sure whether this is an efficient idea or not, but we need
> >to implement the same.
> >Please help.