Date: Thu, 24 Jul 2003 09:49:26 -0700
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: PROC Transreg or Binary values
In-Reply-To: <bfom8m$1kn4$1@f04n12.cac.psu.edu>
Content-Type: text/plain; charset=us-ascii
Daryl,
PROC GLMMOD is designed to generate binary variables from a
categorical predictor variable. First create formatted values of
your variable in a data step, so that missing values are mapped
into the category they belong with (here, M), and then run GLMMOD
on the formatted variable. You would want to run something like:
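
/* Sample data; missover leaves SEX blank when absent (UnqKey 5) */
data mydata;
  length UnqKey 8 sex $ 1;
  infile cards missover;
  input UnqKey SEX;
cards;
1 M
2 F
3 F
4 M
5
6 M
7 F
;

/* Map missing into the M category; the 1:/2: labels sort M before F */
proc format;
  value $sex
    'M',' ' = '1: M'
    'F'     = '2: F';
run;

/* Character variable holding the formatted value */
data formatted / view=formatted;
  set mydata;
  fsex = put(sex,$sex.);
run;

/* Build the design matrix; column 0 (the intercept) is dropped */
ods listing close;
proc glmmod data=formatted
            outparm=parm(where=(_colnum_>0))
            outdesign=design(drop=sex0)
            prefix=sex
            zerobased;
  class fsex;
  model UnqKey = fsex;
run;
ods listing;

/* Attach the binary columns to the original data */
data out;
  merge mydata(in=a)
        design(in=b);
  by UnqKey;
run;

proc print data=parm;
proc print data=design;
proc print data=out;
run;
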
The binary variables are contained in the DESIGN dataset. Note
that SEX1 takes the value 1 for the first formatted value of sex
('1: M', which includes the missings), while SEX2 takes the value
1 for the second formatted value ('2: F'). If you have 200
variables for which you need to generate binary variables, then
you will probably want to embed the processing in a macro,
generating design data for each variable and then merging all of
the design datasets at the end of the process. I don't have time
to provide detail on construction of such a macro. It should not
be difficult. Alternatively, you could generate the design matrix
for all variables at once, but then you cannot use the PREFIX=
option to name the design matrix columns. You would have to parse
the PARM dataset to construct names for the columns in your
DESIGN dataset.
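
A rough, untested sketch of the per-variable macro approach might
look like the following. The macro name MAKE_DUMMIES, its
parameters, and the work dataset _DESIGN are made up for
illustration. It assumes the key is numeric and unique, and that
every variable in the list has already been recoded (as FSEX was
above) so that missings fall into the intended category.

```sas
/* Hypothetical macro: one GLMMOD run per variable, merged by key */
%macro make_dummies(data=, key=, vars=, out=);
  %local i var;

  /* Start from the input data, sorted by the key */
  proc sort data=&data out=&out;
    by &key;
  run;

  %let i = 1;
  %let var = %scan(&vars, &i);
  %do %while (%length(&var) > 0);

    /* Columns are named <var>_0, <var>_1, ...; <var>_0 is the */
    /* intercept and is dropped                                */
    ods listing close;
    proc glmmod data=&data
                outdesign=_design(drop=&var._0)
                prefix=&var._
                zerobased;
      class &var;
      model &key = &var;
    run;
    ods listing;

    proc sort data=_design;
      by &key;
    run;

    /* Add this variable's binary columns to the running result */
    data &out;
      merge &out _design;
      by &key;
    run;

    %let i = %eval(&i + 1);
    %let var = %scan(&vars, &i);
  %end;
%mend make_dummies;

/* Example call using the view created earlier */
%make_dummies(data=formatted, key=UnqKey, vars=fsex, out=out);
```

The underscore in the prefix is there only to avoid name clashes
when a variable name itself ends in a digit. With 200 variables
this makes 200 passes through the data, which is simple but not
fast.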
Dale
--- Daryl R Hoffman <daryl.hoffman@PSUALUM.COM> wrote:
> I need some assistance with regard to SAS please (before I am going
> to kick this machine to pieces)
>
> I have a variable, let's say SEX the possible values is as follows:
> M, F or MISSING.
>
>
> I now want to create a BINARY FIELD, with the names MALE, FEMALE and
> the MISSINGS should form part of the MALE field.
>
> Now I can write a normal data step statement with a SELECT STATMENT
> in it, but the problem is I have about 200 various variables and some
> of them will require up to 20 binary variables. Therefor, I am
> looking for a PROCEDURE or something that can do this for me.
>
> Proc Transreg does it, but it doesn't handle the MISSINGS.
>
> May I also ask where else can I post this question to, as I really
> need to solve this problem asap.
>
> --------------------------------------
> The data set BEFORE transformation:
>
> UnqKey SEX
> 1 M
> 2 F
> 3 F
> 4 M
> 5
> 6 M
> 7 F
>
> ------------------------------------
> The data set AFTER transformation
>
> UnqKey SEX MALE FEMALE
> 1 M 1 0
> 2 F 0 1
> 3 F 0 1
> 4 M 1 0
> 5 1 0
> 6 M 1 0
> 7 F 0 1
>
> ------------------------------------
>
> ANY HELP will be much appreciated.
>
>
> Kind regards
> Skouperd
>
> Please direct replies to: skouperd@mweb.co.za or I will forward them
> to
> him.
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------