Date: Thu, 21 Oct 1999 16:41:35 GMT0BST
Reply-To: Panos PAPANIKOLAOU <PapanikolaouP@CARDIFF.AC.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Panos PAPANIKOLAOU <PapanikolaouP@CARDIFF.AC.UK>
Subject: Re: merging problems.
In-Reply-To: <5187AD6008B7D011B9D500805FFE5181014FCDBF@ner-msg11.midatl.mccaw.com>
Dear All,
I have lately received lots of valuable assistance from you all on
how to deal with a few SAS questions, and thank you very much
for helping me. Fortunately things are moving smoothly, so I have
moved on to the next stage of my analysis, which is merging two
different data-sets. It seems that conventional merge code may not
be straightforwardly applicable to this case, so I explain the
situation below.
1. I have a large data-set - call it MASTER. It has such info as:
a) The central registration number --> CRN, like an ID variable
b) Surname and forename of each patient (two separate vars)
c) Sex of the patient
d) Admission and discharge date of each episode.
This simply means that this data-set has lots of duplicate cases
(where a duplicate is defined by the appearance of the same ID
more than once). In other words, the same patient stays in
clinic A, and this is recorded as ONE episode; then the
patient is discharged (that is, transferred) to
another clinic, and this is recorded as ANOTHER episode.
I then collected information from medical records, entered it in
Excel and lastly imported it into SAS. Let's call this 2nd data-set
TAKEN. For this data-set I have such vars as:
1. CRN,
2. Surname, Forename
3. Admission and Discharge date. Due to entry errors these two
variables have been dropped from the TAKEN data.
4. Gender of patient
Please note that this data-set -- TAKEN -- also has duplicate
cases, as details for TWO different EPISODES (but the same
patient, and so the same CRN, SURNAME, FORENAME) have been
collected.
I want to merge TAKEN with MASTER using CRN and perhaps
Surname and Forename (and/or maybe sex). The new
data-set should contain as many observations as the data-set TAKEN.
That is, if TAKEN has 300 cases, the new data-set also has to
have 300 cases. In addition, this new data-set has to have all
columns of the large data-set, MASTER, plus the columns of
TAKEN, with any common variables of course appearing only once.
In terms of matrix dimension we have:
MASTER: N1xK
TAKEN: N2xZ, where N1 is bigger than N2
NEW (the new data-set): N2xW, where W consists of K, Z minus
any variables in common.
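
To make the target shape concrete, here is a small sketch in Python
(not SAS) of the result described above. The toy rows, the
`diagnosis` column and the function name `merge_keep_taken_rows`
are all made up for illustration. Keeping only the first MASTER row
per CRN is just one assumed way of resolving the duplicates so that
the row count stays at N2; it is not a settled choice.

```python
# Hypothetical toy data: MASTER has one row per episode (N1 = 4, K = 6).
master = [
    {"crn": 101, "surname": "Jones", "forename": "Ann", "sex": "F",
     "admitted": "1998-01-05", "discharged": "1998-01-20"},
    {"crn": 101, "surname": "Jones", "forename": "Ann", "sex": "F",
     "admitted": "1998-01-20", "discharged": "1998-02-11"},
    {"crn": 202, "surname": "Evans", "forename": "Huw", "sex": "M",
     "admitted": "1998-03-02", "discharged": "1998-03-09"},
    {"crn": 303, "surname": "Price", "forename": "Sian", "sex": "F",
     "admitted": "1998-04-01", "discharged": "1998-04-03"},
]

# Hypothetical TAKEN data (N2 = 3, Z = 5); the dates have been dropped,
# and "diagnosis" stands in for whatever was collected from the records.
taken = [
    {"crn": 101, "surname": "Jones", "forename": "Ann", "gender": "F",
     "diagnosis": "A"},
    {"crn": 101, "surname": "Jones", "forename": "Ann", "gender": "F",
     "diagnosis": "B"},
    {"crn": 202, "surname": "Evans", "forename": "Huw", "gender": "M",
     "diagnosis": "C"},
]


def merge_keep_taken_rows(master, taken, key="crn"):
    """Return one output row per TAKEN row, with MASTER columns added."""
    master_cols = list(master[0].keys())

    # Assumption: when MASTER has several rows for a key, use the first.
    first_by_key = {}
    for row in master:
        first_by_key.setdefault(row[key], row)

    merged = []
    for t in taken:
        m = first_by_key.get(t[key], {})
        # Start from the MASTER columns (None where there is no match),
        # then lay the TAKEN values on top, so common variables appear
        # only once and take their value from TAKEN.
        new_row = {col: m.get(col) for col in master_cols}
        new_row.update(t)
        merged.append(new_row)
    return merged


new = merge_keep_taken_rows(master, taken)
print(len(new), "rows,", len(new[0]), "columns")
```

Here the common variables are crn, surname and forename, so W is
6 + 5 - 3 = 8 columns, and NEW has the same number of rows as TAKEN.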
That is the key problem that I need to solve, and I would appreciate
it very much if you could suggest any ideas on how to take it
further.
Thank you very much indeed for taking the time to consider my
request. I look forward to hearing from you.
Regards
Panos