Date: Fri, 3 Sep 2004 10:22:48 -0700
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: Comparing multiple proportions: The Marascuillo procedure
In-Reply-To: <200409031552.i83FqX6p032028@listserv.cc.uga.edu>
Content-Type: text/plain; charset=us-ascii
Stephen,

I don't believe that SAS has such a procedure. However, it could
easily be implemented using data step code. Whether you should
do so is another question. I am not sure that the Marascuilo
multiple proportion comparison procedure is all that great. In
the example given at the website that you referred to, the global
test for differences among the proportions is significant at the
alpha=0.05 level. However, none of the pairwise contrasts is
significantly different. This seems quite peculiar, especially
since the number of samples is the same for each of the five lots
being compared. The global test has enough information to find
significant differences in the sample defect rate, yet there is
not enough information to identify even one contrast where there
is a difference.
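
For reference, here is a minimal data step sketch of the Marascuilo
computation as I read it from the NIST page: each absolute difference
|p_i - p_j| is compared with a critical range built from the
chi-square quantile with k-1 degrees of freedom. The data set name
MARASCUILO and the variable names are my own choices, and the counts
are the five-lot example from the website. Treat it as an
illustration rather than a validated implementation.

```sas
/* Marascuilo pairwise comparisons for k proportions (sketch) */
data marascuilo;
   /* defects and sample sizes for the five lots in the NIST example */
   array d[5] _temporary_ (36 46 42 63 38);
   array n[5] _temporary_ (300 300 300 300 300);
   alpha = 0.05;
   k = dim(d);
   /* square root of the upper-alpha chi-square quantile, k-1 df */
   crit = sqrt(cinv(1 - alpha, k - 1));
   do lot_i = 1 to k - 1;
      do lot_j = lot_i + 1 to k;
         p_i = d[lot_i] / n[lot_i];
         p_j = d[lot_j] / n[lot_j];
         abs_diff   = abs(p_i - p_j);
         crit_range = crit * sqrt(p_i*(1 - p_i)/n[lot_i]
                                + p_j*(1 - p_j)/n[lot_j]);
         /* 1 if the pair is declared different, 0 otherwise */
         significant = (abs_diff > crit_range);
         output;
      end;
   end;
run;

proc print data=marascuilo noobs;
   var lot_i lot_j abs_diff crit_range significant;
run;
```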

Now, how about an easily implemented alternative? An alternative
to the chi-square test for equality of proportions would be to
fit a logistic regression model with defects as the response and
lot as the predictor variable. The procedures LOGISTIC, CATMOD,
and GENMOD can all fit a logistic regression, but none of them has
an LSMEANS statement that allows multiple comparisons. (GENMOD has
an LSMEANS statement, but it does not support multiple comparison
testing. The other two do not have an LSMEANS statement at all.)
However, you can use the GLIMMIX macro - or the new GLIMMIX
procedure! - to fit a logistic regression model. Both the GLIMMIX
macro and the procedure have LSMEANS statements that allow
multiple comparison tests to be performed.

Let me demonstrate with the data presented on the website that
you referred to.

/* Construct data of defect rates in 5 lots */
data test;
   samples=300;
   lot=1; defects=36; link binary;
   lot=2; defects=46; link binary;
   lot=3; defects=42; link binary;
   lot=4; defects=63; link binary;
   lot=5; defects=38; link binary;
   /* Stop here so that execution does not fall through into */
   /* the BINARY routine and write the lot 5 records twice   */
   stop;
binary:
   /* Output a record for each of the 1500 observations */
   /* (5 lots x 300 samples): defect=1 or defect=0      */
   do i=1 to defects;
      defect=1;
      output;
   end;
   do i=1 to samples-defects;
      defect=0;
      output;
   end;
   return;
run;

/* Construct the usual chi-square test that the */
/* proportion of defects is the same across all lots */
proc freq data=test;
   tables defect*lot / chisq;
run;
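
If you would rather not expand the counts to one record per
observation, the same chi-square test can be obtained from cell
counts with a WEIGHT statement. This is only a sketch; the data set
name LOTCOUNTS and the variable COUNT are names I made up for the
illustration, and the non-defect counts are simply 300 minus the
defect counts given above.

```sas
/* One record per lot-by-outcome cell instead of one per observation */
data lotcounts;
   input lot defect count;
   datalines;
1 1 36
1 0 264
2 1 46
2 0 254
3 1 42
3 0 258
4 1 63
4 0 237
5 1 38
5 0 262
;
run;

/* WEIGHT tells PROC FREQ that each record stands for COUNT observations */
proc freq data=lotcounts;
   weight count;
   tables defect*lot / chisq;
run;
```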

/* Use the GLIMMIX macro to */
/* 1) construct a global test that can be compared */
/*    with the chi-square test formed above */
/* 2) form and test pairwise comparisons, allowing */
/*    for multiple comparison adjustments */
%glimmix(data=test,
         stmts=%str(class lot;
                    model defect = lot;
                    lsmeans lot / diff adjust=tukey;
                   ))

You will observe that the overall F-test for the effect of LOT
has very nearly the same p-value as the chi-square test.
Asymptotically, the two tests are equivalent. We may actually
prefer the overall F-test obtained from the logistic regression,
especially if the sample sizes were small. Here, the sample size
is large enough that the two approaches yield very similar global
test statistics.

Having satisfied ourselves that the logistic regression provides
an equivalent overall test, we can look at the pairwise
comparisons obtained from the LSMEANS statement. I employed a
Tukey multiple comparison adjustment. You will observe that three
comparisons are identified as significant:

1) the contrast between lots 1 and 4
2) the contrast between lots 3 and 4
3) the contrast between lots 5 and 4

In addition, the contrast between lots 2 and 4 is significant
at alpha<0.075. None of the other contrasts even approaches
significance. Thus, our test identifies significant pairwise
contrasts that the Marascuilo test does not.

It is easy to generalize this to a model where you have multiple
factors (lots within week and lots between weeks). I think that
it would be best to construct your data with both day of the week
and week as variables. Then you could test whether there were
more failures on, say, Monday than on Friday. A priori hypotheses
about day of the week can be tested directly in GLIMMIX. You
probably do not want to employ multiple comparison tests where an
a priori hypothesis could be constructed. You would likely not
have any a priori hypothesis that certain weeks would have more
failures than other weeks. Thus, you could use multiple comparison
adjustments to test pairwise differences between weeks, while
testing specific day of week contrasts without incurring the
penalty that a multiple comparisons procedure imposes.
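
A two-factor version of the macro call might look something like
the sketch below. Everything about the data here is an assumption
for the sake of illustration: a data set FAILURES with one record
per observation, a 0/1 response FAIL, a WEEK variable, and a DAY
variable coded 1 (Monday) through 5 (Friday) so that the CLASS
levels sort in that order. The ESTIMATE statement tests the single
a priori Monday versus Friday contrast without adjustment, while the
LSMEANS statement applies the Tukey adjustment to the week
comparisons only.

```sas
/* Hypothetical data set and variable names; adapt to your data */
%glimmix(data=failures,
         stmts=%str(class week day;
                    model fail = week day;
                    /* a priori contrast: no multiplicity adjustment */
                    estimate 'Monday vs Friday' day 1 0 0 0 -1;
                    /* no a priori hypotheses about weeks: adjust */
                    lsmeans week / diff adjust=tukey;
                   ))
```

The coefficients on the ESTIMATE statement must line up with the
sorted levels of DAY, so check the class level information in the
output before trusting the contrast.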

Note that the GLIMMIX procedure is available only for SAS version
9.1 on Windows. The GLIMMIX macro has been distributed with SAS
for quite a few years, so it should be readily available.
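
For those who do have the procedure, a call roughly equivalent to
the macro invocation above would be something like the following.
It is a sketch that I would expect to work with the TEST data set
constructed earlier, but the options supported by the release you
have installed may differ, so check its documentation. The EVENT=
response option is there so that the probability of a defect
(defect=1), rather than of a non-defect, is modeled.

```sas
/* Logistic regression of defect on lot with Tukey-adjusted */
/* pairwise comparisons of the lot least squares means      */
proc glimmix data=test;
   class lot;
   model defect(event='1') = lot / dist=binary link=logit;
   lsmeans lot / diff adjust=tukey;
run;
```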
Dale
--- Stephen Arthur <sdaemail@YAHOO.COM> wrote:
> Hello,
>
> Is the Marascuillo multiple proportion comparison procedure
> included in SAS?
>
> http://www.itl.nist.gov/div898/handbook/prc/section4/prc474.htm
>
> I did a search on the SAS website and the searches:
> 1) Marascuillo
> 2) +"multiple comparison" and proportion
>
> turned up zero results.
> http://sas.com/search/index.html
>
> Also, does anyone know of a two-way Marascuillo multiple proportion
> comparison procedure... maybe this doesn't make sense, I'll have to
> think about it some more.
>
> Basically, I have an analysis situation where I want to test the
> significance of proportions within a week, and between weeks. The
> data is very randomly collected (unbalanced sample sizes and
> missing data), these are not controlled studies.
>
> Thanks,
>
> Stephen
>
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------