Date: Thu, 23 Mar 2000 15:25:38 -0600
Reply-To: "Foy, Thomas M." <foytho@HSMNET.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Foy, Thomas M." <foytho@HSMNET.COM>
Subject: Episode of Coverage
To anyone who will listen:
I am trying to construct episodes of coverage for patients in a database.
By episode of coverage I mean a continuous period of time that a patient was
covered by an insurance provider. The main source of frustration in this
venture is that I have multiple start dates and end dates for each person.
The data looks something like this:
Name          Start Date  End Date
Warren Pease  01mar1994   31jul1994  \
Warren Pease  01mar1994   31jul1994  /  the duplication is legitimate
Warren Pease  01aug1994   28feb1994
Warren Pease  01mar1995   29feb1996
Warren Pease  01mar1995   29feb1996
Warren Pease  01mar1996   31oct1996
Warren Pease  01nov1996   31mar1997
Warren Pease  01nov1996   31mar1997
Warren Pease  01aug1997   31may1998
Warren Pease  01aug1997   31may1997
Sandy Beech   01feb1995   31dec1995
Sandy Beech   01jan1996   31oct1996
Sandy Beech   01jun1996   31dec1996  ---- some coverage periods are subsets
                                          of a larger coverage period
Sandy Beech   01nov1996   31dec1996
Sandy Beech   01jan1997   31jan1997
Sandy Beech   01jan1997   31jul1997
Sandy Beech   01aug1997   .          \  missing means that at the time the
Sandy Beech   01aug1997   .           > sample was taken their coverage
Sandy Beech   01aug1997   .          /  had not ended.
And the list continues with Norman Conquest, Guy Wire, Sara Bellum etc.
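For illustration only, here is a minimal Python sketch (not SAS) of how rows like the ones above might be read. It assumes each row is "name start end" separated by whitespace, that dates are in ddmonyyyy form, and that "." marks a missing end date. The names Coverage, parse_date9 and parse_row are hypothetical, not anything from the actual data base.

```python
from datetime import date
from typing import NamedTuple, Optional

# Three-letter month abbreviations, as used in ddmonyyyy dates.
MONTHS = {m: i for i, m in enumerate(
    "jan feb mar apr may jun jul aug sep oct nov dec".split(), start=1)}


class Coverage(NamedTuple):
    name: str
    start: date
    end: Optional[date]  # None: coverage had not ended when the sample was taken


def parse_date9(text: str) -> Optional[date]:
    """Parse a ddmonyyyy string such as '01mar1994'; '.' means missing."""
    text = text.strip().lower()
    if text in ("", "."):
        return None
    return date(int(text[5:9]), MONTHS[text[2:5]], int(text[0:2]))


def parse_row(line: str) -> Coverage:
    """Split 'First Last start end' into a Coverage record (assumed layout)."""
    *name_parts, start_text, end_text = line.split()
    start = parse_date9(start_text)
    if start is None:
        raise ValueError(f"row has no start date: {line!r}")
    return Coverage(" ".join(name_parts), start, parse_date9(end_text))


rows = [parse_row(r) for r in (
    "Warren Pease 01mar1994 31jul1994",
    "Warren Pease 01mar1994 31jul1994",   # legitimate duplicate, kept as-is
    "Sandy Beech 01aug1997 .",            # open-ended coverage
)]
```

Keeping the missing end date as None, rather than substituting a made-up date, leaves the "coverage had not ended" meaning intact for whatever step comes next.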
What I need to do is sort through each person and construct one, or more,
periods of continuous coverage starting with the earliest start date and
ending with the latest end date. The catch is this: if there is a break of
more than 45 days, there is to be a new episode of coverage. If coverage
ends one day and starts the next day, there is no interruption in coverage.
For example: Suppose Xavier Ownassez has coverage starting on 01jan1994
and ending on 31dec1994, then starting on 01jan1995 and ending on 31dec1995,
then starts again on 01jan1996 and ends on 30jun1996, then starts again on
01nov1996 and ends on 31jan1997. In tabular form it would look like this:
Start End
01jan1994 31dec1994
01jan1995 31dec1995
01jan1996 30jun1996
01nov1996 31jan1997
What I want to end up with is two episodes of coverage: 01jan1994 to
30jun1996, and 01nov1996 to 31jan1997. Am I making myself clear?
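To make the rule concrete, here is a rough sketch in Python (not SAS) of one way the merging could work for a single person: sort the periods by start date, then extend the current episode while the break is 45 days or less. It assumes a "break" is the number of uncovered days between one end date and the next start date (so ending one day and starting the next is a break of 0), and that a missing end date means open-ended coverage. The name build_episodes is hypothetical.

```python
from datetime import date

MAX_GAP_DAYS = 45  # a break longer than this starts a new episode


def build_episodes(periods, max_gap_days=MAX_GAP_DAYS):
    """Merge one person's (start, end) periods into episodes of coverage.

    `end` may be None, meaning coverage had not ended (assumption).
    Duplicates and periods nested inside larger ones are absorbed.
    """
    episodes = []
    for start, end in sorted(periods, key=lambda p: p[0]):
        # Treat open-ended coverage as running to the largest possible date.
        end = date.max if end is None else end
        # Uncovered days between the current episode's end and this start.
        if episodes and (start - episodes[-1][1]).days - 1 <= max_gap_days:
            episodes[-1][1] = max(episodes[-1][1], end)
        else:
            episodes.append([start, end])
    # Turn the open-ended marker back into None.
    return [(s, None if e == date.max else e) for s, e in episodes]


# The Xavier Ownassez example from the text above.
xavier = [
    (date(1994, 1, 1), date(1994, 12, 31)),
    (date(1995, 1, 1), date(1995, 12, 31)),
    (date(1996, 1, 1), date(1996, 6, 30)),
    (date(1996, 11, 1), date(1997, 1, 31)),
]
episodes = build_episodes(xavier)
# episodes: 01jan1994 to 30jun1996, and 01nov1996 to 31jan1997
```

With many people in the file, the same function would be applied once per person after grouping the records by name or patient id. Whether "more than 45 days" should count the boundary days differently is a judgment call, which is why the threshold and the day count are kept in one place.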
There is duplication in the start and end dates, and there are coverage
periods that are subsets of other coverage periods. It's supposed to be
that way. Why? I don't really know. I have 37 variables in each record,
678,000+ individuals, 1,000,000+ records, and each record is unique. I just
need to construct episodes of coverage based on the start and end dates, for
now. There's more for later.
Any assistance that anyone is willing to give is much, much, much
appreciated. I've used up three horses' worth of glue trying NOT to come
unglued, and my feet are getting sore from climbing the walls.
Thanks in advance,
Thomas M. Foy
Sr. Programmer/Analyst
HealthSystem Minnesota
Minneapolis, Minnesota
foytho@hsmnet.com