| Date: | Fri, 29 Aug 2003 11:24:52 -0400 |
| Reply-To: | Ian Whitlock <WHITLOI1@WESTAT.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Ian Whitlock <WHITLOI1@WESTAT.COM> |
| Subject: | Re: proc format from dataset for multiple variables. |
|
| Content-Type: | text/plain |
|---|
Pankaj,
Here is some code to help you get started. Note I changed the structure of
the control file per Howard's directions.
/* test data */
data w ;
do obs = 1 to 100 ;
num_b = 300 * ranuni ( 14231 ) ;
output ;
end ;
run ;
/* get mean */
proc sql noprint ;
select mean(num_b) into :mean_b
from w ;
run ;
/* make format data */
data fmtdata ;
retain mean_b &mean_b ;
input Fmtname $ Start $ end $ label $ hlo $ ;
if upcase(fmtname) = "NUM_BFMT" then
label = put ( mean_b , best12. ) ;
cards ;
Num_Bfmt . . . .
Num_Bfmt low 100 . l
Num_Bfmt 300 high . h
;
/* make the format */
proc format cntlin = fmtdata fmtlib ;
run ;
/* prove of the pudding */
data q ;
set w ;
newb = input(put(num_b,num_bfmt12.6), best12.) ;
run ;
IanWhitlock@westat.com
-----Original Message-----
From: Pankaj Singal [mailto:pankaj.singal@WELLSFARGO.COM]
Sent: Friday, August 29, 2003 10:54 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: proc format from dataset for multiple variables.
When I tried to implement the process for one numeric and character
variable, the character variable runs fine, but I am facing problems with
the numeric variables. How to go about the same.
This is what I intend to do: (I have numeric variable num_b that can have
legal values b/w 100-300. and if in my input dataset the value for num_b is
in the range ([100-300]), we do not change anything but if it is not, we
change as per the following.
Control_file
Fmtname Start Label
Num_Bfmt . mean(num_b)
Num_Bfmt low-100 mean(num_b)
nu_Bfmt 300-high mean(num_b)
How do I make a caro that calculates the mean and put it in the
control_file. Also, I am not sure how to incorporate the range that is
valid. i.e. if the num_b = 105, then my variable in "test" dataset should be
105. How to get that in without using an if statement in data step.
Any suggestions, would be greatly appreciated.
Thanks
Howard_Schreier@ITA.DOC.GOV (Howard Schreier) wrote in message
news:<200308282045.h7SKjid31677@listserv.cc.uga.edu>...
> Seems like a good approach.
>
> I suggest that you you make your control dataset conform as much as
> possible to the structure and naming conventions of a SAS PROC FORMAT
> CNTLIN dataset.
>
> On Wed, 27 Aug 2003 09:23:35 -0700, Pankaj Singal
> <pankaj.singal@WELLSFARGO.COM> wrote:
>
> >I have a huge dataset(sample) (>100 variables/fields eg, A B C D E F
> >...A1..A75...). I am trying to create new dateset (test) with all
> >valid entries for various fields in "sample" dataset. I created a
> >separate dataset (control_dataset) that has 5 columns, which will
> >help me create the proc formats to create the "test" dataset with
> >valid entries.
> >
> >Control_dataset
> >
> >Var_name old_value Operator new_value Description
> >A "1" EQ "WALK" "1 changes to Walk"
> >A "2" EQ "RUN" "2 changes to RUN"
> >A other else "INVALID" "Invalid entry"
> >B [100,300] EQ [100,300] "valid range 100-300"
> >B other else mean "invalid value replaced by
mean"
> >.
> >.
> >.
> >?..
> >>>>This table lists each valid entry for the variable. E.g. A can
> >>>>have
> >valid values for 1 or 2, if it is not 1 or 2 then the entry is
> >invalid.
> >
> >>>>Other includes invalid entries for that variable and assuming that
> >it will pick up the missing values too.
> >
> >There are both character and numeric fields that needs to be
> >validated and put in "test" dataset. I need to get the new_value in
> >the variable and also the description field associated with the same.
> >(There are around 75 fields that needs to be validated).
> >
> >I am trying to create the proc formats from control_dataset to create
> >"test" dataset. that would have the
> >
> >I am not sure whether this is an efficient idea or not, but we need
> >to implement the same.
> >
> >Please help.
|