Date: Wed, 23 Jul 2003 12:17:07 -0700
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: GEE with multiple treatments per subject: how to model?
In-Reply-To: <3d935f86.0307220914.35274940@posting.google.com>
Content-Type: text/plain; charset=us-ascii
Margarita,
There are a couple of possible ways to analyze these data. One
possible approach would be to employ a bootstrap process, although
I am not certain right off-hand what the resample units should be.
Given that you have correlations due to both visit and treatment
and you have an ordinal response, I would suggest that you try
to fit an ordinal response mixed model using the procedure
NLMIXED. Note that other correspondents suggested GLIMMIX
(which is not a procedure, but a macro) and MIXED. Neither of
these would be suitable for an ordinal response variable.
However, the NLMIXED procedure has the flexibility to fit a
proportional odds mixed model. The only problem that you
might face with NLMIXED is the number of random effects that
you must estimate. NLMIXED is limited in this regard, at least
in comparison with the MIXED procedure.
To use the NLMIXED procedure, you will have to write down the
likelihood function for an ordinal response subject to random
effects. For an ordinal response with k+1 levels of response,
the probability that y=i is given by
p(y=0) = exp(eta0) / (1 + exp(eta0))
p(y=1) = exp(eta1) / (1 + exp(eta1)) -
exp(eta0) / (1 + exp(eta0))
p(y=2) = exp(eta2) / (1 + exp(eta2)) -
exp(eta1) / (1 + exp(eta1))
...
p(y=k) = 1 - exp(eta[k-2]) / (1 + exp(eta[k-2))
where eta0, eta1, ..., eta[k-2] are linear combinations of the
predictor variables, including any random effects. Given the
complexity of your model, I would suggest using a common
random intercept for all k-1 linear combinations for the
construction of p(y=i) by treatment and visit. That is, I
would suggest
eta0 = trt1_0*(trt=1) + trt2_0*(trt=2) + trt3_0*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta1 = trt1_1*(trt=1) + trt2_1*(trt=2) + trt3_1*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta2 = trt1_2*(trt=1) + trt2_2*(trt=2) + trt3_2*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta3 = trt1_3*(trt=1) + trt2_3*(trt=2) + trt3_3*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
You will observe that the effects trt1_i, trt2_i, and trt3_i are
intercepts modelling the probability of response level i for each
of the three outcomes at the first visit. The terms trt1_vis2,
trt1_vis3, ..., trt3_vis3 are deviations from the intercepts at
subsequent visits. We are fitting a proportional odds model, so
that the same deviations from the intercept apply to all levels
of the response. Terms gtrt1_visit1, gtrt1_visit2, ...,
gtrt3_visit3 are visit by treatment specific random intercept terms
that apply to all levels of the response variable. Having constructed
eta[i] and p[i], the log likelihood contribution can be constructed
as log(p[i]). We can now put all this together as NLMIXED code. I
will construct a model in which the random effects have an
unstructured covariance for the multivariate response, and have
an AR(1) structure over time. You may wish to modify this to have
a compound symmetric error structure over time or an unstructured
error structure for the repeated measures component.
proc nlmixed data=mydata;
parms b0=0 b1=0 b2=0 ... b[k-1]=0
trt1=0 trt2=0
visit1=0 visit2=0
trt1_visit1=0 trt1_visit2=0 trt2_visit1=0 trt2_visit2=0
logsig_v1=0 logsig_v2=0 logsig_v3=0
rho12=0 rho13=0 rho23=0 rho_t=0;
bounds -1<rho12<1, -1<rho13<1, -1<rho23<1, -1<rho_t<1;
eta0 = trt1_0*(trt=1) + trt2_0*(trt=2) + trt3_0*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta1 = trt1_1*(trt=1) + trt2_1*(trt=2) + trt3_1*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta2 = trt1_2*(trt=1) + trt2_2*(trt=2) + trt3_2*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
eta3 = trt1_3*(trt=1) + trt2_3*(trt=2) + trt3_3*(trt=3) +
trt1_vis2*(trt=1)*(visit=2) +
trt1_vis3*(trt=1)*(visit=3) +
trt2_vis2*(trt=2)*(visit=2) +
trt2_vis3*(trt=2)*(visit=3) +
trt3_vis2*(trt=3)*(visit=2) +
trt3_vis3*(trt=3)*(visit=3) +
gtrt1_visit1*(trt=1)*(visit=1) +
gtrt1_visit2*(trt=1)*(visit=2) +
gtrt1_visit3*(trt=1)*(visit=3) +
gtrt2_visit1*(trt=2)*(visit=1) +
gtrt2_visit2*(trt=2)*(visit=2) +
gtrt2_visit3*(trt=2)*(visit=3) +
gtrt3_visit1*(trt=3)*(visit=1) +
gtrt3_visit2*(trt=3)*(visit=2) +
gtrt3_visit3*(trt=3)*(visit=3);
exp_eta0 = exp(eta0);
exp_eta1 = exp(eta1);
exp_eta2 = exp(eta2);
exp_eta3 = exp(eta3);
p0 = exp_eta0 / (1 + exp_eta0);
p1 = exp_eta1 / (1 + exp_eta1) -
exp_eta0 / (1 + exp_eta0);
p2 = exp_eta2 / (1 + exp_eta2) -
exp_eta1 / (1 + exp_eta1);
p3 = exp_eta3 / (1 + exp_eta3) -
exp_eta2 / (1 + exp_eta2);
p4 = 1 - exp_eta3 / (1 + exp_eta3);
ll0 = log(p0);
ll1 = log(p1);
ll2 = log(p2);
ll3 = log(p3);
ll4 = log(p4);
if y=0 then loglike=ll1; else
if y=1 then loglike=ll1; else
if y=2 then loglike=ll2; else
if y=3 then loglike=ll3; else
if y=4 then loglike=ll4; else
model y ~ general(loglike);
random gtrt1_visit1 gtrt2_visit1 gtrt3_visit1
gtrt1_visit2 gtrt2_visit2 gtrt3_visit2
gtrt1_visit3 gtrt2_visit3 gtrt3_visit3
normal([0,0,0,0,0,0,0,0,0],
[exp(2*logsig_v1),
rho12*exp(logsig_v1+logsig_v2),
rho13*exp(logsig_v1+logsig_v3),
rho_t*exp(2*logsig_v1),
rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*rho13*exp(logsig_v1+logsig_v3),
rho_t*rho_t*exp(2*logsig_v1),
rho_t*rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*rho_t*rho13*exp(logsig_v1+logsig_v3),
exp(2*logsig_v2),
rho23*exp(logsig_v2+logsig_v3),
rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*exp(2*logsig_v2),
rho_t*rho23*exp(logsig_v2+logsig_v3),
rho_t*rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*rho_t*exp(2*logsig_v2),
rho_t*rho_t*rho23*exp(logsig_v2+logsig_v3),
exp(2*logsig_v3),
rho_t*rho13*exp(logsig_v1+logsig_v3),
rho_t*rho23*exp(logsig_v2+logsig_v3),
rho_t*exp(2*logsig_v3),
rho_t*rho_t*rho13*exp(logsig_v1+logsig_v3),
rho_t*rho_t*rho23*exp(logsig_v2+logsig_v3),
rho_t*rho_t*exp(2*logsig_v3),
exp(2*logsig_v1),
rho12*exp(logsig_v1+logsig_v2),
rho13*exp(logsig_v1+logsig_v3),
rho_t*exp(2*logsig_v1),
rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*rho13*exp(logsig_v1+logsig_v3),
exp(2*logsig_v2),
rho23*exp(logsig_v2+logsig_v3),
rho_t*rho12*exp(logsig_v1+logsig_v2),
rho_t*exp(2*logsig_v2),
rho_t*rho23*exp(logsig_v2+logsig_v3),
exp(2*logsig_v3),
rho_t*rho13*exp(logsig_v1+logsig_v3),
rho_t*rho23*exp(logsig_v2+logsig_v3),
rho_t*exp(2*logsig_v3),
exp(2*logsig_v1),
rho12*exp(logsig_v1+logsig_v2),
rho13*exp(logsig_v1+logsig_v3),
exp(2*logsig_v2),
rho23*exp(logsig_v2+logsig_v3),
exp(2*logsig_v3)])
subject=subject;
contrast "Any difference in treatment effects"
trt2_0-trt1_0, trt3_0-trt1_0,
trt2_1-trt1_1, trt3_1-trt1_1,
trt2_2-trt1_1, trt3_2-trt1_1,
trt2_3-trt1_1, trt3_3-trt1_1,
trt1_vis2, trt1_vis3,
trt2_vis2, trt2_vis3,
trt3_vis2, trt3_vis3;
contrast "Any difference over time in treatment effects"
trt1_vis2, trt1_vis3,
trt2_vis2, trt2_vis3,
trt3_vis2, trt3_vis3;
run;
If you change references to rho_t*rho_t to simply rho_t, then
you can fit the compound symmetric repeated measures covariance
structure. This model may need to be executed on a machine with
a lot of RAM. I have certainly fit nonlinear mixed models using
the NLMIXED procedure with as many random effects as you have
in the above problem. However, these models are computationally
intensive in memory requirements as well as in the amount of
time required to achieve convergence. You may need to be
very patient for the model to complete.
Certainly, you could dichotomize your response and use the
GLIMMIX macro to fit the model. Note that there is not a
GLIMMIX procedure, as someone suggested. GLIMMIX is a macro
that iteratively calls the MIXED procedure. You actually
pass PROC MIXED code in the invocation of the GLIMMIX macro.
But if your random effects are large (as I suspect that they
will be in this study), then GLIMMIX is probably not acceptable.
If you have trouble fitting the random effects proportional
odds model with NLMIXED, then you could modify the NLMIXED
problem to fit the somewhat simpler model for a binary
response.
Having written a thesis here, and given you plenty to digest
already, I think I will stop at this point. Good luck. I
would be interested to know if you try NLMIXED for the
random effects model with ordinal response variable, and if
so how the model works for you.
Dale
--- Margarita <rlopatin@YAHOO.COM> wrote:
> Jay Weedon <jweedon@earthlink.net> wrote in message
> news:<tleqhvsc733t21odannkhc5hk36b9eajm5@4ax.com>...
> > On 21 Jul 2003 15:54:40 -0700, rlopatin@yahoo.com (Margarita)
> wrote:
> >
> > >-Hello all,
> > >
> > >I am trying to fit a GEE model for the following study design:
> every
> > >subject in a study was exposed to all study treatments (the
> treatments
> > >are local in nature and do not interact) at baseline. The
> response, an
> > >irritation score on the ordinal scale with more than two levels,
> is
> > >recoded at each treatment site at several time points. Some
> people
> > >are more likely to have irritation than others (no matter what
> > >treatment is applied) so one would expect a correlation between
> > >treatment responses within a subject, as well as between time
> points.
> > > The objective is to see if some treatments are more likely to
> lead to
> > >higher irritation scores than others.
> > >
> > >I would appreciate any advice on what would be the best way to
> model
> > >this either using SAS version 8 procedures, Splus or custom
> macros.
> > >
> > >I've tried proc GENMOD so far but am not convinced it's doing the
> > >right thing. It allows only independent working correlation
> matrix
> > >with multinomial response, which it seems would ignore the
> correlation
> > >between time measurements & treatment responses.
> >
> > Post details of your problem and SAS code to comp.soft-sys.sas.
> >
> > JW
>
> -Jay,
>
> Here is an example of the data:
> subject trt visit y
> 1 A 1 0
> 1 B 1 1
> 1 C 1 1
> 2 A 1 1
> 2 B 1 1
> 2 C 1 0
> ....
> 1 A 2 0
> 1 B 2 0
> 1 C 2 1
> ....
> Each subject was exposed to, say, 3 treatments at the start of the
> study. The irritation levels are recorded on ordinal scale (0 to 4)
> at several visits. I used the following code for analysis:
>
> proc genmod data=irritation descending;
> class trt visit subject;
> model y= trt visit /dist=mult link=clogit type3;
> repeated subject=subject / type=ind withinsubject=visit*trt;
> run;
>
> This what I received from the SAS technical support:
> "Only TYPE=IND is available for multinomial responses. Once there is
> ample statistical literature supporting the other correlation
> matrices, they will be added to GENMOD. Currently this is an area of
> research for many statisticians.
>
> Something to keep in mind is that the Emperical based standard errors
> make no assumptions regarding the correlation matrix. They require
> only that your mean model be correctly specified. This means that
> they are robust no matter what your choice of correlation matrix."
>
> So, it seems there is no way to model ordinal data accounting for
> within subject correlation in SAS. If anyone heard of a program that
> can do that, please let me know.
>
> I have received a reference to GLIMMIX from Iyue Sung. My assumption
> has been that GLM deals with continuos responses but I'll check it
> out, just in case.
>
> In terms of analysis, I also tried dichotomizing the response and
> analyze it with AR(1) correlation matrix. The results seem to be
> consistent.
>
> Thanks
> Margarita
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com
|