LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2005, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 8 Nov 2005 10:14:26 -0500
Reply-To:     "Michael S. Zdeb" <msz03@HEALTH.STATE.NY.US>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Michael S. Zdeb" <msz03@HEALTH.STATE.NY.US>
Subject:      Re: How to make repeats in merge-by-group to work?
Comments: To: Number_42@CARAMAIL.COM
In-Reply-To:  <1131450900.655635.242990@g43g2000cwa.googlegroups.com>
Content-type: text/plain; charset=US-ASCII

Hi...I agree with Toby Dunn...use PROC SQL. But, for those of us who used SAS prior to V6, there was no PROC SQL and many-to-many merges still had to be done with a data step. For those of us who remember NESUG '88 (or for that matter, 1988), the proceedings contain a paper by Robina Thornton and Merry Rabb on "Advanced SET and MERGE Processing" that shows how to do many-to-many merging in a data step merge. The method does what PROC SQL does in the background, i.e. it ensures that all observations in data sets one and two with the same ID "see" each other at some point in the data step processing. So, here it is (from NESUG '88) with the posted data and it does give the desired result...

data one; input id $ fruit $; datalines; a apple a apple b banana c coconut c coconut c coconut ;

proc sort data=one; by id; run;

data two; input id $ color $; datalines; a amber b brown b black c cocoa c cream ;

proc sort data=two; by id; run;

* create POINTERS for both data sets - reduces the data step merge to one-to-one merging;

data one_point (keep=id one_start one_end); retain one_start; set one; by id; if first.id then one_start = _n_; if last.id then do; one_end = _n_; output; end; run;

data two_point (keep=id two_start two_end); retain two_start; set two; by id; if first.id then two_start = _n_; if last.id then do; two_end = _n_; output; end; run;

* merge the POINTER data sets; * use the point information to cycle through observations with matching IDs;

data one_two (keep=id fruit color); merge one_point two_point; by id; do i = one_start to one_end; set one point=i; do j = two_start to two_end; set two point=j; output; end; end; run;

proc print data=one_two; run;

Gives...

Obs id fruit color 1 a apple amber 2 a apple amber 3 b banana brown 4 b banana black 5 c coconut cocoa 6 c coconut cream 7 c coconut cocoa 8 c coconut cream 9 c coconut cocoa 10 c coconut cream

PS Another posted solution...

DATA test(drop=id2 ); set one; do i=1 to nn; set two(rename=(id=id2)) nobs=nn point=i; if id=id2 then output; end; RUN;

also works but requires a complete pass through data set two for each observation in data set one rather than only looking within observations with the same ID.

Mike Zdeb U@Albany School of Public Health 1 University Drive Rensselaer, NY 12144-3456 (P)518-402-6479 (F)630-604-1475

Anonymous user <Number_42@CARAMA IL.COM> To Sent by: "SAS(r) SAS-L@LISTSERV.UGA.EDU Discussion" cc <SAS-L@LISTSERV.U GA.EDU> Subject How to make repeats in merge-by-group to work? 11/08/2005 06:55 AM

Please respond to Anonymous user <Number_42@CARAMA IL.COM>

Hello,

i have some datasets with repeats in the by group and it's not a problem. I would like SAS to handle it properly. Let's start from what SAS do:

DATA one; INPUT id $ fruit $; DATALINES; a apple a apple b banana c coconut c coconut c coconut ;

PROC SORT data=one; BY id; RUN;

DATA two; INPUT id $ color $; DATALINES; a amber b brown b black c cocoa c cream ;

PROC SORT data=two; BY id; RUN;

DATA test; MERGE one two; BY id; RUN;

PROC PRINT; RUN;

/* Obs id fruit color

1 a apple amber 2 a apple amber 3 b banana brown 4 b banana black 5 c coconut cocoa 6 c coconut cream 7 c coconut cream */

Now what i need is that: 1 a apple amber 2 a apple amber 3 b banana brown 4 b banana black 5 c coconut cocoa 6 c coconut cocoa 7 c coconut cocoa 8 c coconut cream 9 c coconut cream 10 c coconut cream

It migth sound useless here but in my real data, i have other columns which have different values but are not keys.

I know how to do this with a proc sql, but is it possible to do it in "pure SAS"?

TIA


Back to: Top of message | Previous page | Main SAS-L page