| Date: | Mon, 7 Mar 2005 16:37:00 -0800 |
| Reply-To: | Dale McLerran <stringplayer_2@YAHOO.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
|
| From: | Dale McLerran <stringplayer_2@YAHOO.COM> |
| Subject: | Re: NLMIXED: REPLICATE statement |
| In-Reply-To: | <20050306141940.61457.qmail@web26406.mail.ukl.yahoo.com> |
| Content-Type: | text/plain; charset=us-ascii |
|---|
--- "adel F." <adel_tangi@yahoo.fr> wrote:
> Hi Dale,
> I try also a syntax using the binomial distribution
>
> model COUNT ~ binomial(total,p);
>
> random u~normal(0,exp(2*Log(s2u)))
> with count is the count for Y=1 by region*X1*X2 and total is the
> total of the frequency by region*X1*X2 for both Y=1 and Y=0
>
> I use the proc freq to create a frequency for both Y=1 and Y=0 by
> region*X1*X2 and then I use proc sql to create the total.
>
>
> proc freq data=mydata;
>
> tables region*X1*X2*Y_dep/ out= test1;
>
> run;
>
> /*------------create total for Y_dep-------------*/
>
> proc sql;
>
> create table test2 as select distinct
> region,X1,X2,Y_dep,COUNT,sum(COUNT) as total from
>
> test1 group by region,X1,X2;
>
> quit;
>
> /*-----------slect data for Y_Dep=1-------*/
>
> data test3;
>
> set test2;
>
> where Y_Dep=1;
>
> run;
>
> Then I use data test3 with binomial distribution
>
> for variable "count" with p defined as previuosly, and the variable
> "total"
>
> The model gives very similar results to the approach that you suggest
> using a general likelihood
>
>
> Any remarks or questions will be usefuls
>
> Thanks
> Adel
Adel,
If your variance components are small, then the bernoulli and
binomial models will be quite similar. However, if the variance
components are large, then these two models can differ greatly.
Take a look at the following simulation:
%macro Gen_fit_REmodels(root_var=, b0=, b1=, seed=);
/* Generate some correlated binary responses */
data test;
b0 = &b0;
b1 = &b1;
do ID=1 to 50;
subject_effect = &root_var*rannor(1234579);
do i=1 to 40;
a = ceil(2*ranuni(&seed)) - 1;
eta = b0 + b1*a + subject_effect;
expeta = exp(eta);
p = expeta / (1 + expeta);
y = ranbin(&seed,1,p);
output;
end;
end;
keep ID y a;
run;
/* Construct one observation per ID/predictor/response */
/* combination with frequency of combination attached. */
proc sql;
create table test1 as select distinct id, a, y, count(*) as freq
from test
group by id, a, y;
quit;
/* Construct total for each ID/predictor combination. */
proc sql;
create table test2 as
select distinct id, a, y, freq, sum(freq) as total
from test1
group by id, a;
quit;
/*-----------select data for Y=1-------*/
data test3;
set test2;
where Y=1;
run;
title "Fit random effect model for bernoulli response";
title2 "employing aggregated data";
proc nlmixed data=test1;
eta = b0 + b1*a + u;
expeta = exp(eta);
p = expeta / (1 + expeta);
if y=1 then loglike = log(p);
else loglike = log(1-p);
loglike = freq*loglike;
model y ~ general(loglike);
random u ~ normal(0, exp(2*logsu)) subject=id;
/* Test whether the parameter estimates are consistent */
/* with the values employed to simulate the data. */
contrast "b0=&b0" b0-&b0;
contrast "b1=&b1" b1-&b1;
contrast "sqrt(Var(RE))=&root_var" exp(logsu)-&root_var;
run;
title "Fit random effect model for binomial response";
proc nlmixed data=test3;
eta = b0 + b1*a + u;
expeta = exp(eta);
p = expeta / (1 + expeta);
model freq ~ binomial(total,p);
random u ~ normal(0, exp(2*logsu)) subject=id;
/* Test whether the parameter estimates are consistent */
/* with the values employed to simulate the data. */
contrast "b0=&b0" b0-&b0;
contrast "b1=&b1" b1-&b1;
contrast "sqrt(Var(RE))=&root_var" exp(logsu)-&root_var;
run;
%mend;
%Gen_fit_REmodels(root_var=0.25, b0=0, b1=0.5, seed=1234579);
Note that for the small variance component employed here, the
two approaches produce the same point estimates and standard
errors. Since the point estimates and standard errors are
identical, then we have identical tests for all contrasts.
But if we repeat this experiment with a larger variance
for our random effect, we observe quite different results.
Run the model employing
%Gen_fit_REmodels(root_var=5, b0=0, b1=0.5, seed=1234579);
You observe that the point estimates and their standard errors
are now quite different. Our test that confidence limits for
the point estimates include the true values are not rejected
when we assume a bernoulli distribution. However, the
confidence intervals do not include the true values for the
intercept and the square root of the variance component when
the binomial model is fit.
Sorry, I don't have time to provide further comment at this
time. I'll just suggest that you try other values for the
parameters of the model, try other seed values, etc. You should
observe that if the data are generated as a bernoulli response
subject to random effects, then you cannot assume that a
random effects binomial response model provides correct
results.
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
__________________________________
Celebrate Yahoo!'s 10th Birthday!
Yahoo! Netrospective: 100 Moments of the Web
http://birthday.yahoo.com/netrospective/
|