In a related question, I asked about individual versus aggregated data and
received a very informative response:
As I am dealing with person-time data, I have a related question about
degrees of freedom. Assuming independence, or stipulating it for the sake of
argument, which would be better for a study:
1) Observations on 500 subjects for 1 year each
2) Observations on 300 subjects for 2 years each
Let's further assume that these subjects have been matched, that is, they
have the same covariate values, so that information (combinations of
covariates that allow for estimability) is not an issue.
I would, perhaps naively, prefer #2. Given some of the conclusions in the
post above, namely that the SE and differences in LL for the LRT are the
same, correctly specifying the DoF may not be critically important.
At least as an exercise, it interests me. If we define the rate as outcomes
per person-year, might we weight the subjects in #2 to reflect their
person-years, i.e. give each a weight of 2?
A sensible follow-up would be what happens with non-integer person-time:
4.5 person-years, for example?
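To make the exercise concrete, here is a minimal sketch of how I picture the comparison, assuming a constant-rate Poisson model with independent subjects. The function names and the rate of 0.1 events per person-year are hypothetical, chosen only for illustration. Under that model the information about the log rate depends on the expected number of events (rate times total person-time), so person-time enters as an exposure and nothing requires it to be an integer. The last function checks that weighting each subject's observed rate by their person-years gives the same log-likelihood in the rate as the exposure formulation, up to a constant.

```python
import math

def expected_se_log_rate(rate, n_subjects, years_each):
    """Approximate SE of log(rate) under a constant-rate Poisson model:
    1 / sqrt(expected number of events)."""
    expected_events = rate * n_subjects * years_each
    return 1.0 / math.sqrt(expected_events)

def loglik_exposure(rate, events, person_years):
    """Poisson log-likelihood with person-time as exposure (constants dropped)."""
    return sum(d * math.log(rate * t) - rate * t
               for d, t in zip(events, person_years))

def loglik_weighted_rate(rate, events, person_years):
    """Same model written as person-time-weighted per-subject rates d/t."""
    return sum(t * ((d / t) * math.log(rate) - rate)
               for d, t in zip(events, person_years))

assumed_rate = 0.1  # hypothetical events per person-year, for illustration only

se_design_1 = expected_se_log_rate(assumed_rate, 500, 1.0)    # 500 person-years
se_design_2 = expected_se_log_rate(assumed_rate, 300, 2.0)    # 600 person-years
se_fractional = expected_se_log_rate(assumed_rate, 100, 4.5)  # non-integer follow-up

# Made-up toy data: events and follow-up time for three subjects.
events = [0, 1, 2]
person_years = [2.0, 4.5, 1.0]

# The two log-likelihoods differ only by a term that does not involve the
# rate, so their difference is the same at any two rate values.
gap_at_low_rate = (loglik_exposure(0.2, events, person_years)
                   - loglik_weighted_rate(0.2, events, person_years))
gap_at_high_rate = (loglik_exposure(0.7, events, person_years)
                    - loglik_weighted_rate(0.7, events, person_years))

print(se_design_1, se_design_2, se_fractional)
print(gap_at_low_rate, gap_at_high_rate)
```

If this reasoning holds, design #2 would give a slightly smaller standard error simply because it accrues 600 rather than 500 person-years, and the weight of 2 would apply to each subject's rate rather than acting as a frequency weight that counts the subject twice. All of this rests on the constant-rate and independence assumptions stated above.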
Any insights, references, critiques, corrections, or flames are greatly