Date: Mon, 13 Oct 1997 23:37:43 GMT
Reply-To: LPogoda <lpogoda@AOL.COM>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: LPogoda <lpogoda@AOL.COM>
Organization: AOL http://www.aol.com
Subject: Re: Merge and interleave simultaneously?
In previous replies under this heading, I suggested a couple of approches
using a three-way merge. Various deficiencies in my "solutions" were
correctly pointed out to me, so I'd like to try again.
The original question (which I've lost) asked how to interleave two data
sets while simultaneously appending data from a third data set to the
interleaved observations. Because of the size of the original data sets,
the questioner wanted to avoid using intermediate data sets if at all
possible. So first interleaving the two data sets and merging the result
with the third was not an acceptable solution.
One way to get a result would be by:
proc sort data = men;
by occ;
proc sort data = women;
by occ;
proc sort data = occ;
by occ;
data all (keep = id occ desc);
merge men (in = men) occ (in = occupy);
by occ;
if men and occupy then output;
merge women (in = women) occ (in = occupy);
by occ;
if women and occupy then output;
run;
The assumption here is that if an observation on either men or women had
an occupation code not on occ the observation is invalid, and we don't want
that observation in the output.
This would interleave an observation from men, followed by an observation
for women, each observation with the appended data from occ. This is not
quite the same result that would be obtained by first interleaving men and
women by occ, which would have all the observations in men grouped by a
particular occ followed by all observations from women grouped by a
particular occ.
If that particular result is important, the last step could be replaced by:
data all;
length desc $ 12;
retain desc;
set occ (in = occupy rename = (desc = desc1))
men (in = men)
women (in = women);
by occ;
if occupy then desc = desc1;
if men or women then output;
run;
This step ASSUMES that all occupation codes that could appear on both the
men and women data sets are in the occ data set. If that's not the case
then such observations would have the wrong occupation description on the
output data set. I don't remember if such an assumption is warranted or
not.