Date: Fri, 25 Jan 2008 12:11:11 -0800
Reply-To: Robin High <robinh@UOREGON.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Robin High <robinh@UOREGON.EDU>
Subject: Re: Using survival analysis for maternal gestation data
In-Reply-To: A<200801251805.m0PGqbxq003481@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="utf-8"
A few observations:
When one examines the likelihood functions, “censoring” is a nice feature of survival models, but it is not essential to their computation or to their applicability to “time to event” data. For example, LIFEREG will fit normal theory regression models by maximum likelihood; with right or left censoring it is a “Tobit” model.
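To make the point about censoring concrete, here is a sketch of the usual likelihood for right-censored data. The notation is my own assumption for illustration, not anything LIFEREG prints: t_i is the observed time, delta_i is 1 for an observed event and 0 for a censored one, f is the density, and S is the survivor function.

```latex
L(\theta) = \prod_{i=1}^{n} f(t_i \mid \theta)^{\delta_i} \, S(t_i \mid \theta)^{1-\delta_i}
\qquad\text{and, if } \delta_i = 1 \text{ for all } i,\qquad
L(\theta) = \prod_{i=1}^{n} f(t_i \mid \theta)
```

When every delivery time is observed, the survivor-function terms drop out and what remains is the ordinary likelihood, so a dataset with no censored records poses no computational problem.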
Most of the survival techniques I’m familiar with are designed for “continuous” time to event data, which are typically skewed to the right. But even more relevant here, with an observational study such as this, the data seem much more discrete than continuous, especially since I doubt the precise moment of conception is known with any real accuracy. A relatively small number (8?) of ordinal integer response values (30, 31, ..., 37) seems to be the best case, and even with them measurement error would exist. And now that the percentages for a few of the weeks have been stated, it’s difficult to imagine that a normal theory model would be all that much more helpful or even relevant here.
Rather than dichotomizing, you could perhaps divide the weeks into 3 or 4 meaningful response levels (e.g., 30-33, 34-36, 37 ???) and apply an ordinal logistic model (preferably one that has the proportional odds assumption satisfied) to look at the very early deliveries versus normal. Though I’m generally against binning numerical data, this is one case where the interpretation with odds ratios may favor doing so. And it would allow you to control for the other variables as well.
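As a rough illustration of the binning idea, the Python sketch below collapses weeks into three ordered levels and computes the cumulative odds ratio (Group B versus Group A) at each cut point. The cut points follow the bins suggested above, with 37 treated as its own level, and the counts are invented purely for illustration. This is only a descriptive check, not a fitted model; in SAS, an ordinal response in PROC LOGISTIC gives the proportional odds fit along with a score test of that assumption.

```python
def bin_week(week):
    """Map a gestational week (30..37) to an ordered level 0, 1, or 2."""
    if week <= 33:
        return 0  # weeks 30-33
    if week <= 36:
        return 1  # weeks 34-36
    return 2      # week 37


def level_counts(week_counts):
    """Collapse a {week: count} dict into counts for the three levels."""
    levels = [0, 0, 0]
    for week, count in week_counts.items():
        levels[bin_week(week)] += count
    return levels


def cumulative_odds_ratios(counts_a, counts_b):
    """Odds ratios (B vs. A) of falling at or below each of the two cut points."""
    la, lb = level_counts(counts_a), level_counts(counts_b)
    ratios = []
    for cut in range(len(la) - 1):
        odds_a = sum(la[:cut + 1]) / sum(la[cut + 1:])
        odds_b = sum(lb[:cut + 1]) / sum(lb[cut + 1:])
        ratios.append(odds_b / odds_a)
    return ratios


# Hypothetical counts of deliveries by week, made up for this example only.
group_a = {30: 10, 31: 10, 32: 20, 33: 20, 34: 40, 35: 60, 36: 80, 37: 160}
group_b = {30: 20, 31: 20, 32: 30, 33: 30, 34: 50, 35: 60, 36: 70, 37: 120}

ratios = cumulative_odds_ratios(group_a, group_b)
print(ratios)
```

If the two ratios are close to each other, a single proportional odds ratio is a reasonable summary; if they differ a lot, the proportional odds assumption is doubtful for that grouping.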
With 40,000 cases you’ll likely get highly significant results, so you would need to interpret them in the context of what difference is of real interest.
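A small Python sketch of the sample-size point, using a pooled two-proportion z-test. The proportions (10% versus 11%) and the group sizes are assumptions chosen only to illustrate the effect; they are not taken from the data under discussion.

```python
import math


def two_proportion_z_test(p1, p2, n1, n2):
    """Two-sided pooled z-test for a difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same one-point difference in proportions, two very different sample sizes.
z_small, p_small = two_proportion_z_test(0.10, 0.11, 500, 500)
z_large, p_large = two_proportion_z_test(0.10, 0.11, 20000, 20000)
print(f"n=500 per group:   z={z_small:.2f}, p={p_small:.4f}")
print(f"n=20000 per group: z={z_large:.2f}, p={p_large:.4f}")
```

The difference being tested is identical in both calls; only the sample size changes the p-value, which is why the size of the effect, not the p-value, should drive the interpretation.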
Robin High
University of Oregon
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Robert Feyerharm
Sent: Friday, January 25, 2008 10:05 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Using survival analysis for maternal gestation data
Thanks for the comments & suggestions, everyone.
Regarding whether the Cox PH model is appropriate to use on my data, I
performed a test to determine if the proportional hazards assumption is
correct for my data. The results indicate that the hazards are not
proportional; therefore I can't use the Cox PH model in this case.
Unfortunately, this means I can't use a multivariate survival analysis in
order to control for other variables (race, education, etc.).
Another issue is the lack of censored records in my dataset. I suppose a
true censored event would be a mother who didn’t deliver during the time
data was being collected for the dataset. Since I know when every mother
delivered, I don’t think there would be any censored events. However, I
checked Kleinbaum & Klein's text and there are no explicit warnings against
using survival analysis with uncensored data.
I tried the Wilcoxon Rank Sum test and there was sufficient evidence to
reject the null (H0 = no difference in gestational periods) in support of
mothers with gonorrhea having shorter gestational periods. I was
fortunate to have a large dataset, though: if my dataset had fewer records, or if
on average STD positive moms had only slightly shorter gestational periods,
then a Wilcoxon Rank Sum or t-test might not have been powerful enough to
detect a difference, which is one reason I considered a survival analysis
model.
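For reference, here is a minimal Python sketch of the Wilcoxon Rank Sum test using midranks and a tie-corrected normal approximation, which matters here because gestational weeks are heavily tied. It omits the continuity correction and assumes the pooled values are not all identical; the week values are invented for illustration. (In SAS the usual route would be PROC NPAR1WAY with the WILCOXON option.)

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test with midranks and a tie-corrected normal
    approximation. Returns (W, z, two-sided p), where W is the rank sum of x."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x) + Counter(y)

    # Assign each distinct value the average of the ranks it occupies.
    midrank = {}
    start = 0
    tie_term = 0
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = start + (t + 1) / 2
        tie_term += t ** 3 - t
        start += t

    w = sum(midrank[v] for v in x)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p_value


# Hypothetical gestational weeks for two small groups (illustration only).
std_positive = [30, 31, 33, 34, 34, 35, 36, 36]
std_negative = [33, 35, 36, 36, 37, 37, 37, 37]
w, z, p = rank_sum_test(std_positive, std_negative)
print(f"W={w}, z={z:.2f}, p={p:.4f}")
```

A negative z here means the first group tends to have smaller values, that is, shorter gestations.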
For example, let's say for Group A pregnancies: 5% of deliveries are in
week 30, 6% are in week 31,..., 10% are in week 37; and for Group B
pregnancies: 5.1% are in week 30, 6.1% are in week 31,...,10.1% are in week
37. Here Group B pregnancies show a small, but consistent increase in
preterm birth delivery frequencies. If a survival analysis model isn't
appropriate here, what about using a more powerful paired t-test or
Wilcoxon Signed Rank Test to compare the *difference* in the % (or
proportion) of deliveries during each week of pregnancy between the two
groups?
Thanks,
Robert