Date: Mon, 11 Aug 2008 10:14:40 -0700
Reply-To: Steve Denham <stevedrd@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Steve Denham <stevedrd@YAHOO.COM>
Subject: Re: Degrees of freedom, person or person-time
/robot voice on
Must have more information...
/voice off
What sort of outcome are you talking about here? Is a year too large a bin--would a month or a day be more appropriate? Looking through Dale's response, I see he specified an offset, and that looks like what you are describing when you mention weighting.
It's a critical point. The offset needs to reflect the observational period accurately, and the duration of the outcome needs to be considered as well. You wouldn't use an offset in units of days if the outcome were a month-long hospitalization or, conversely, an offset in years if the outcome were an hour-long clinic visit.
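To make the point about units concrete, here is a minimal Python sketch (not SAS, and not taken from Dale's post). It assumes a simple Poisson rate model in which the offset is log(person-time). The data, the helper names `log_offset` and `crude_rate`, and the 365.25 days-per-year conversion are all made up for illustration. It shows only that the unit chosen for the offset rescales the estimated rate, so the unit has to be one in which the outcome makes sense.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed conversion factor, for illustration only


def log_offset(follow_up_days, unit="years"):
    """Return log(person-time) for one subject in the requested unit.

    This is the quantity that would be supplied as the offset in a
    Poisson model with a log link.
    """
    if follow_up_days <= 0:
        raise ValueError("follow-up time must be positive")
    if unit == "days":
        person_time = follow_up_days
    elif unit == "years":
        person_time = follow_up_days / DAYS_PER_YEAR
    else:
        raise ValueError("unit must be 'days' or 'years'")
    return math.log(person_time)


def crude_rate(events, follow_up_days, unit="years"):
    """Crude rate: total events divided by total person-time."""
    total_time = sum(follow_up_days)
    if unit == "years":
        total_time /= DAYS_PER_YEAR
    return sum(events) / total_time


# Hypothetical subjects followed for 1, 2, 0.5 and 4.5 years.
events = [0, 1, 0, 2]
follow_up_days = [365.25, 730.5, 182.625, 1643.625]

rate_per_person_year = crude_rate(events, follow_up_days, unit="years")
rate_per_person_day = crude_rate(events, follow_up_days, unit="days")
offsets_in_years = [log_offset(d, unit="years") for d in follow_up_days]
```

The two rates describe the same data and differ only by the conversion factor. The model cannot tell which unit is sensible for the outcome, so that choice stays with the analyst.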
Steve Denham
Associate Director, Biostatistics
MPI Research, Inc.
Remove spamblock from header, and replace with stevedrd to reply to me.
----- Original Message ----
From: Kevin Viel <citam.sasl@GMAIL.COM>
To: SAS-L@LISTSERV.UGA.EDU
Sent: Monday, August 11, 2008 10:44:00 AM
Subject: Degrees of freedom, person or person-time
In a related question, I asked about individual versus aggregated data and
received a very informative response:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0808b&L=sas-l&F=&S=&P=8873
As I am dealing with person-time data, I have a related question about
degrees of freedom. Assuming independence, or stipulating it for the sake of
argument, which would be better for a study:
1) Observations on 500 subjects for 1 year each
2) Observations on 300 subjects for 2 years each
Let's further assume that these subjects have been matched, that is, have
the same covariate values, so that information (combinations of covariates
that allow for estimability) is not an issue.
I would, perhaps naively, prefer #2. Given some of the conclusions in the
post above, namely that the SE and differences in LL for the LRT are the
same, correctly specifying the DoF may not be critically important.
At least as an exercise, it interests me. If we defined the rate to be
outcomes per person-year, then might we weight the subjects in #2 to
reflect their person-years, i.e. 2?
A sensible follow-up would be what happens with non-integer results: 4.5
person-years, for example?
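As a rough way to frame the comparison, the sketch below works through the two designs in Python under strong assumptions: a constant Poisson rate, independent person-time (as stipulated above), and the large-sample approximation that the standard error of the log rate is 1/sqrt(expected events). The rate of 0.10 events per person-year and the helper names are hypothetical, chosen only to make the arithmetic visible.

```python
import math


def person_years(n_subjects, years_each):
    """Total person-time contributed by a design."""
    return n_subjects * years_each


def se_log_rate(rate_per_person_year, total_person_years):
    """Approximate SE of the log rate under a constant-rate Poisson model.

    Uses the large-sample result Var(log rate) ~ 1 / expected events.
    """
    expected_events = rate_per_person_year * total_person_years
    return 1.0 / math.sqrt(expected_events)


ASSUMED_RATE = 0.10  # hypothetical events per person-year

design_1_py = person_years(500, 1.0)  # 500 subjects, 1 year each
design_2_py = person_years(300, 2.0)  # 300 subjects, 2 years each

se_design_1 = se_log_rate(ASSUMED_RATE, design_1_py)
se_design_2 = se_log_rate(ASSUMED_RATE, design_2_py)

# Non-integer person-time needs no special handling with an offset:
# a subject followed for 4.5 person-years simply contributes log(4.5).
fractional_offset = math.log(4.5)
```

Under these assumptions precision is driven by total person-time (500 versus 600 person-years), not by the head count, so design 2 gives the smaller standard error. If the two years observed on the same subject are correlated, this approximation no longer holds.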
Any insights, references, critiques, corrections, or flames are greatly
appreciated.
Kind regards,
Kevin