Date:         Tue, 10 May 2005 10:02:49 -0700
Reply-To:     cassell.david@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:      Re: Multiple record data problem and survival analysis
In-Reply-To:  <>
Content-type: text/plain; charset=US-ASCII

Neerav Monga <neerav.monga@GMAIL.COM> wrote:
> I am stuck on this problem and was hoping someone can help. Here is the
> problem:
>
> I have a varying number of records per patient, an age of diagnosis for
> the disease and a diagnosis code number (i.e. ICD codes that start with
> 153 or 154). I am interested in developing a survival analysis model
> with time to disease as my main outcome.
>
> The issues that make this confusing are:
> a) each subject has multiple icd codes, however I only want those that
>    have a particular disease (e.g. cancer)
> b) If a person has multiple cancers (i.e. icd starts with 153/154) I
>    want to select the record with the youngest age of onset of cancer
>    (since I am predicting time to 1st cancer)
> c) some subjects do not have cancer at all, yet have multiple records
>    and I want to select those with the youngest age (just to keep my
>    coding rules consistent).

It seems to me that you have a harder problem than that.

[1] If a person can have multiple records with multiple ICD codes, then you can have patients who have at least one record before a cancer diagnosis. In that case, it seems to me that you WOULD NOT want the earliest record for the non-cancer patients. Can you go back and confer with others at your site and work out better selection rules?

[2] If you want a survival analysis model, then you also have to consider that the earlier records for each patient might have a lot of diagnostic and modeling value. So perhaps you need the information from the other records as well. Now you have to decide how to model this information and how you want to analyze the data.
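
To make the selection rules in [1] concrete, here is a minimal sketch in Python (not SAS, and not a recommendation) of the rule as stated in the question: take the earliest cancer record when one exists, otherwise pick one non-cancer record. The function names, the `event` flag, and the `noncancer_pick` parameter are hypothetical, added only so the non-cancer rule can be swapped out once it has been settled. The rows are the ones from the data snapshot in the original post.

```python
# Rows copied from the data snapshot: (subject, agediagnosis, icdcode).
# ICD codes are kept as strings so the prefix test is unambiguous.
records = [
    (4979, 64, "1841"),
    (4979, 62, "1741"),
    (3673, 42, "1820"),
    (3673, 72, "1539"),
    (1989, 70, "1531"),
    (1989, 71, "1889"),
    (2989, 60, "1531"),
    (2989, 71, "1549"),
]


def is_cancer(icdcode):
    """True when the ICD code starts with 153 or 154, as in the question."""
    return icdcode.startswith(("153", "154"))


def select_one_per_subject(rows, noncancer_pick=min):
    """Return {subject: (age, icdcode, event)} with one record per subject.

    event is 1 when a cancer record exists (earliest cancer age is kept),
    0 otherwise. noncancer_pick is a hypothetical hook: it chooses the
    record for subjects without cancer (min = youngest age, max = oldest).
    """
    by_subject = {}
    for subject, age, icd in rows:
        by_subject.setdefault(subject, []).append((age, icd))

    selected = {}
    for subject, recs in by_subject.items():
        cancer = [r for r in recs if is_cancer(r[1])]
        if cancer:
            age, icd = min(cancer)           # youngest age of cancer onset
            selected[subject] = (age, icd, 1)
        else:
            age, icd = noncancer_pick(recs)  # rule still to be decided
            selected[subject] = (age, icd, 0)
    return selected


selected = select_one_per_subject(records)
for subject, (age, icd, event) in selected.items():
    print(subject, age, icd, event)
```

Note that subject 3673 has a non-cancer record at age 42 and a cancer record at age 72; the sketch keeps only the age-72 record, so the earlier record mentioned in [2] is dropped. Calling `select_one_per_subject(records, noncancer_pick=max)` would instead keep the age-64 record for subject 4979, which shows how much the result depends on the rule chosen for non-cancer patients.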

HTH,
David
--
David Cassell, CSC
Senior computing specialist
mathematical statistician

"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> wrote on 05/09/2005 12:11:30 PM:

> Hi Everyone,

> Data Snapshot:
>
> subject  agediagnosis  icdcode
> 4979     64            1841
> 4979     62            1741
> 3673     42            1820
> 3673     72            1539
> 1989     70            1531
> 1989     71            1889
> 2989     60            1531
> 2989     71            1549
>
> I hope this is clear, my goal is to have one record per observation
> fitting the various criteria i've explained. Thanks a lot for any
> suggestions in advance.
>
> Cheers,
>
> Neerav

