Date: Mon, 26 Nov 2007 21:49:02 -0500
Reply-To: Peter Flom <email@example.com>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject: Re: Group size needed for mixed model (binary response)
Content-Type: text/plain; charset=UTF-8
Susan Lingle <susan.lingle@ULETH.CA> wrote:
>My question is a statistical one, not anything specific to use of SAS.
>From reading the archives, there are clearly many knowledgeable people
>out there, and I am hoping someone can advise whether a mixed model is
>appropriate to use to analyse my data.
Until a knowledgeable person comes along, I'll try to answer :-)
>I have a data set for deer fawns, in which I want to test whether fawns
>of one species, white-tailed deer, are more likely to die from predation
>during the first few months of life (summer) than mule deer. I plan to
>run a separate analysis to test whether the other species, mule deer, are
>more likely than whitetails to die during winter. For the summer sample,
>there are 129 whitetail fawns from 124 mothers and 207 mule deer fawns
>from 177 mothers. For the winter sample, there are 26 whitetail fawns
>from 25 mothers, and 129 mule deer from 103 mothers. Of course there is
>only one measurement (live or die) for each fawn.
>Someone strongly recommended that I use a GLMM with the mother's
>identity as a random factor to analyse the survival data (e.g., GLIMMIX
>in SAS). I certainly appreciate the value of including family effects as
>random factors when there is a large enough family to estimate those
>effects, or the variance associated with those effects. But in this
>case, most females have one fawn so the data appear insufficient to
>estimate random effects or the variance, and I believe the latter is
>needed to estimate an intercept.
>I have searched far and wide for an answer. The closest thing I found,
>and it seems to make sense, is an article suggesting that a large group
>size (n=50) as well as a large number of groups (n=100) are needed for a
>mixed effects logistic regression to produce decent estimates of fixed
>effects as well as random effects (citation below). They found severe
>flaws in estimating fixed as well as random effects when group size was
>less than 5. Apparently, the sample size issues are not as restrictive
>for linear models, although I get the impression one still would need
>more than n=5 for each group.
>Is it appropriate to use mixed models for a binary DV, or even for a linear
>DV, when the groups usually consist of 1 individual and at most 2?
First of all, thank you for providing the context needed to try to answer the question. Very nice.
Second, no, I don't think a mixed model is appropriate here. Rather, I think you should find the mothers with multiple fawns and randomly choose 1 fawn from each, so that every mother contributes exactly one observation. Then your data are independent. I don't want to give an exact number needed per group, but clearly one per group is not enough to estimate a mother-level random effect.
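For what it's worth, here is a minimal sketch of that subsampling step, written in Python rather than SAS just to show the logic. The field names (`fawn_id`, `mother_id`, `species`, `died`), the function name, and the toy records are all made up for illustration; they are not from the actual data set. The fixed seed is only there so the same fawns are chosen on every run.

```python
import random
from collections import defaultdict


def one_fawn_per_mother(records, seed=2007):
    """Keep one randomly chosen fawn per mother.

    `records` is a list of dicts with the hypothetical keys
    'fawn_id' and 'mother_id' (assumed names, not from the study).
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    # Group fawns by mother.
    by_mother = defaultdict(list)
    for rec in records:
        by_mother[rec["mother_id"]].append(rec)

    # Mothers with a single fawn keep it; siblings are reduced to one at random.
    return [rng.choice(fawns) for fawns in by_mother.values()]


# Toy data, invented for illustration only.
fawns = [
    {"fawn_id": "F1", "mother_id": "M1", "species": "whitetail", "died": 1},
    {"fawn_id": "F2", "mother_id": "M1", "species": "whitetail", "died": 0},
    {"fawn_id": "F3", "mother_id": "M2", "species": "mule", "died": 0},
    {"fawn_id": "F4", "mother_id": "M3", "species": "mule", "died": 1},
    {"fawn_id": "F5", "mother_id": "M3", "species": "mule", "died": 1},
]

subset = one_fawn_per_mother(fawns)
print(len(subset))  # one record per mother
```

The reduced data set could then go into an ordinary (fixed-effects) logistic regression of survival on species. Which sibling gets kept depends on the seed, so it may be worth rerunning with a few different seeds to check that the conclusions do not change.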
Hope this helps