|
I see that Robin has already responded about the basic nature
of the problem here. Let me just add the following observations
about Ryan's specific problem.
Ryan has two levels of random effects: city (N~=40) and person
within city (N~=2500 or N~=62/city). The random effects
design matrix requires a column for each city random effect
(intercept, month, and disease*x1 as specified in the model).
In addition, the design matrix includes more than 60 columns
of person within city random effects. I don't know how the
GLIMMIX procedure is constructed - whether the design matrix
for the person random effects is reduced as much as possible
so that the number of columns in the person effects portion
of the design is represented as the number of persons in
the city that has the most individuals (something over 62)
or whether the design is represented with 2500 columns and
thus 2500 random effects. I would hope that it is the former.
Regardless, the quadrature solution requires integrating over
all of the random effects which means at a minimum there would
be 3 + (62+) nested integrals. The computation problem
becomes immense here. Following Robin's post, there would
be at least 3**(65) likelihood evaluations per iteration.
While the GLIMMIX procedure does in theory support estimation
of nested random effect models when a quadrature solution has
been requested, in practice this is limited to models with
a suitably small design matrix. The data employed here do
not have a suitably small design matrix for the random effects.
Thus, I see no hope of fitting this model using quadrature
methods.
Finally, let me post another thought or two about the model
which is fit here. There appear to be four disease classes,
but disease was not named on the CLASS statement. That seems
wrong to me. I would also echo Robin's concerns regarding
month parameterization (class vs continuous). I also have
questions regarding whether month and disease*x1 should be
named on the random statement.
I might take exception to Robin's suggestion to ignore the
person random effects and focus only on the city random
effects. If anything, I would remove the city random effects
and specify only the person random effects. There is
probably more variability in the person random effects than
in the city random effects. You want to account for the main
source of random variation. Thus, it would seem to me that
you would want to model the person random effects. When the
city random effects have been excluded, the person random
effects will include city variability. But the reverse is
not true. When persons are excluded, then their random
effects are averaged over and will almost certainly average
to something quite close to zero within a city - especially
when there are so many persons/city.
Thus, I would suggest dropping the city random effects and
estimate the model with just person random effects. The
nesting problem that causes expansion of the random effect
design matrix disappears when city is dropped from the
model. You should be able to fit the person random effect
model with more quadrature points (although if month and
disease*x1 are kept as random effects and especially if
month and disease have been added to the CLASS statement,
then the model could once again expand beyond what can
reasonably be handled with a large number of quadrature
points).
Dale
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@NO_SPAMfhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
--- On Wed, 11/12/08, Robin R High <rhigh@UNMC.EDU> wrote:
> From: Robin R High <rhigh@UNMC.EDU>
> Subject: Re: Insufficient Resources to Perform Adaptive Qaudrature...
> To: SAS-L@LISTSERV.UGA.EDU
> Date: Wednesday, November 12, 2008, 8:06 AM
> Ryan,
>
> Terms or images to express the "infinite" have
> long been a part of our
> history:
>
> From Genesis 15, God speaks to Abram "Look toward
> heaven and count the
> stars, if you are able to count them." Then he said to
> him, "So shall your
> descendants be."
>
> ...in the early 80s it was "Cosmos" and Carl
> Sagan's famous phrase
> "billions and billions..."
>
> .. and to match wits, when "infinity" looks at
> us, we react with the
> desire for "more POWER, EH, EH!" (Tim the
> Toolman)
>
> .. and now the new SAS 9.2 generation can speak of
> "infinity" through
> "random effects with quadrature", though I
> imagine that Abram or the
> "Cosmos" audience would produce a blank stare if
> such terminology had been
> in vogue.
>
> So, after that digression, please read the section on
> quadrature/laplace
> options in
>
> http://www2.sas.com/proceedings/forum2007/177-2007.pdf
>
> from which the following section is extracted:
>
> ***begin quote
> proc glimmix method=quad(qpoints=5);
> class A B id;
> model y = / dist=negbin;
> random A / subject=id;
> run;
>
> Computing the marginal log likelihood requires the
> numerical evaluation of
> a four-dimensional integral. As
> part of this evaluation, 54 = 625 conditional log
> likelihoods need to be
> computed per observation on each
> pass through the data. Now suppose that an additional
> random effect with b
> = 2 levels is added and the
> previous RANDOM statement becomes the following:
>
> random A A*B / subject=id;
>
> For each observation, this now requires evaluation of
> 5**(4+8) =
> 244,140,625 [commas added for clarity] conditional log
> likelihoods on each
> pass through the data.
> ***end quote
>
> .. whatever "each pass" actually means, remember
> that GLIMMIX needs to
> iterate and reiterate to find a solution, which I suppose
> like the
> national debt or "bailout" figures we've been
> hearing gets rather large
> fairly quickly. I suppose this is one reason that I once
> recall hearing
> Oliver S. himself say that he was reluctant to include quad
> as an option
> with GLIMMIX (though it is nice to now be able to produce
> the same results
> with GLIMMIX that one would produce with NLMIXED).
>
> So, in my opinion, don't hestitate to work with other
> estimation routines,
> esp. when you have a reasonably large sample size (which
> you do) and the
> random effects aren't too "big".
>
> And finally, just a few thoughts about your code, one can
> always start
> with a simpler model and work up in complexity, if
> possible. And I'm
> curious why is month treated as a linear covariate, esp.
> one treated as
> having a random slope within each city? Perhaps quarterly
> for a seasonal
> categorical effect might be a suggestion. And the following
> code would
> likely capture the essence of the problem, though with
> binary data it may
> be computationally difficult to estimate random subject
> effects, even
> though conceptually it would be important.
>
> proc glimmix data=mydata ;
> class Person_ID City_ID; *include month or a quarterly
> effect here?? ;
> model y = Disease x1 month disease*x1 / s dist=binary;
> random intercept x1 / subject=City_ID ; * would a
> type=<another option>
> work better? ;
> run;
>
> .. as always, something new to learn or features to
> consider,esp. now with
> 9.2 available...
>
> Robin High
> UNMC
>
> Note: not sure why the original message became garbled when
> I look at it
> on the SAS-L archives, so thought I would re-submit
> hopefully with this
> version of the note now in an ungarbled format.
>
>
>
>
>
> Ryan <Ryan.
|