Date: Tue, 19 Apr 2011 09:30:32 -0500
Reply-To: Robin R High <rhigh@UNMC.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Robin R High <rhigh@UNMC.EDU>
Subject: Re: Proc Reg: Multiple Model Statements
In-Reply-To: <81F8139F381BE844AE05CA6525FF2AAE03188756@tpwd-mx9.tpwd.state.tx.us>
Content-Type: text/plain; charset="US-ASCII"
Warren,
Unrelated to PROC REG, but another missing-data feature that can "getcha"
occurs in PROC TABULATE when variables with missing values are placed on
the CLASS statement but not used in a TABLE statement. For example,
embellishing your example with two categorical variables:
data test;
input a b X1 X2 Y1 Y2;
datalines;
1 1 1 . 2 5
1 1 2 . 5 6
. 1 3 2 7 7
2 2 4 3 8 8
2 2 5 4 11 12
;
run;
proc tabulate noseps ;
class a b;
var x1 ;
table b, x1*(n*f=4.0 mean*f=5.1) / rts=10;
run;
---------------------
|        |    X1    |
|        |----------|
|        | N  |Mean |
|--------+----+-----|
|b       |    |     |
|1       |   2|  1.5| *
|2       |   2|  4.5|
---------------------
* a missing value of variable a causes one record to disappear
To get the correct computations, add the MISSING option to the
PROC TABULATE statement.
proc tabulate noseps missing;
class a b;
var x1 ;
table b, x1*(n*f=4.0 mean*f=5.1) / rts=10;
run;
---------------------
|        |    X1    |
|        |----------|
|        | N  |Mean |
|--------+----+-----|
|b       |    |     |
|1       |   3|  2.0|
|2       |   2|  4.5|
---------------------
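
A possible alternative, sketched here on the assumption that variable a is
not needed in any table: leave it off the CLASS statement, so its missing
value cannot exclude the record. This is an illustration rather than part
of the original example; it should give the same N=3 and mean of 2.0 for
b=1 as the run with the MISSING option.

```sas
/* Sketch: list only the CLASS variables that a TABLE statement uses. */
/* Assumes the data set test created above.                           */
proc tabulate data=test noseps;
  class b;   /* a is omitted, so its missing value no longer matters */
  var x1;
  table b, x1*(n*f=4.0 mean*f=5.1) / rts=10;
run;
```
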
The moral: whenever a procedure step names variables that have missing
data but are not actually used in the analysis, check the results for the
variables that do have complete data, because observations may have been
dropped.
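
One way to make that check, as a minimal sketch assuming the data set test
from above: compare the count of non-missing values (N) with the count of
missing values (NMISS) for every variable named in the step, before
trusting the N reported by the procedure itself.

```sas
/* Sketch: count non-missing (N) and missing (NMISS) values per variable. */
/* Assumes the data set test created above; all variables are numeric.    */
proc means data=test n nmiss;
  var a b x1 x2 y1 y2;
run;
```

For this data, a shows one missing value and X2 shows two, which flags the
variables that can silently remove records.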
Robin High
UNMC
From: Warren Schlechte <Warren.Schlechte@TPWD.STATE.TX.US>
To: SAS-L@LISTSERV.UGA.EDU
Date: 04/18/2011 04:12 PM
Subject: Proc Reg: Multiple Model Statements
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
I just discovered that when using Proc Reg (SAS 9.1 and 9.2) with
Multiple Model Statements, only the observations that are complete for
all of the models are used in the model fitting.
Is this well-known and is this by design?
For example, using the following data:
data test;
input X1 X2 Y1 Y2;
datalines;
1 . 2 5
2 . 5 6
3 2 7 7
4 3 8 8
5 4 11 12
;
run;
You see that the pair X1-Y1 has 5 data points, whereas the pair X2-Y2
has only 3 because X2 has 2 missing values.
If you run the following Proc Reg, all 5 data pairs (x1-Y1) are used, as
I would expect.
title "X1 Alone";
proc reg;
model Y1=X1;
run;
If you run this Proc Reg, the 3 data pairs (x2-y2) are used; again, as I
would expect.
title "X2 Alone";
proc reg;
model Y2=X2;
run;
However, when you run the following Proc Reg, only 3 data pairs (x1-Y1)
are used, not the 5 as I would expect.
title "X1 and X2 Together";
proc reg;
model Y1=X1;
model Y2=X2;
run;
Again, is this well-known and is this by design? I looked through the
online docs and it didn't seem clear to me that this was the case, but
my search was by no means exhaustive.
Thanks,
Warren Schlechte
HOH Fisheries Science Center
5103 Junction Hwy
Mt. Home, TX 78058
Phone 830.866.3356 x214
Fax 830.866.3549