LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (September 2007, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 4 Sep 2007 15:49:41 -0400
Reply-To:   Richard Reeves <reeves@STUDENTCLEARINGHOUSE.ORG>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Richard Reeves <reeves@STUDENTCLEARINGHOUSE.ORG>
Subject:   Re: Many to many merge with dates
In-Reply-To:   A<200709041930.l84J3Mi1005826@mailgw.cc.uga.edu>
Content-Type:   text/plain; charset="us-ascii"

Depending on the size of your data you may want to consider a variation to the paper: http://www8.sas.com/scholars/Proceedings/2006/ETL/ET07_06.PDF It discusses the SyncJoin Algorithm for many to many merges. If you do use a proc SQL explore the optimizer features and the buffersize _method and _tree options. Rich

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@listserv.vt.edu] On Behalf Of Arthur Tabachneck Sent: Tuesday, September 04, 2007 3:30 PM To: SAS-L@LISTSERV.VT.EDU Subject: Re: Many to many merge with dates

Since the data step can't handle many-to-many merges, I'd suggest a SQL approach. Is something like the following what you are looking for?

data First_Dataset; input ID @9 Firstdate date9.; format Firstdate date9.; cards; 1 20Jan2002 1 18Aug2002 1 30May2002 2 15Mar2002 2 20Apr2002 2 18May2002 ;

data Second_Dataset; input ID @9 Seconddate date9. type; format Seconddate date9.; cards; 1 16Jan2002 2 1 18Jan2002 3 1 24May2002 2 1 1Jun2002 2 1 1Aug2002 2 1 16Aug2002 2 1 30Aug2002 2 2 10Mar2002 3 2 12Mar2002 4 2 20Mar2002 4 2 16Apr2002 3 2 18Apr2002 3 2 28Apr2002 3 2 14May2002 3 2 16May2002 3 2 26May2002 3 ;

proc sql; create table want as select * from (select f.ID, Firstdate, Seconddate, type from First_dataset as f join Second_dataset as s on f.id eq s.id having Firstdate gt Seconddate and Firstdate-20 LE Seconddate) group by id,FirstDate having Firstdate-Seconddate eq min(Firstdate-Seconddate); quit;

Art -------- On Tue, 4 Sep 2007 08:38:50 -0700, phrising@YAHOO.DK wrote:

>Dear fellow SAS-users, >I am seeking a solution to the following problem. I have a dataset >with multiple occurrences of the same ID and a firstdate > >ID Firstdate >1 20Jan2002 >1 18Aug2002 >1 30May2002 >2 15Mar2002 >2 20Apr2002 >2 18May2002 > >I have another dataset with an ID variable, a date and a type: > >ID Seconddate Type >1 16Jan2002 2 >1 18Jan2002 3 >1 24May2002 2 >1 1Jun2002 2 >1 1Aug2002 2 >1 16Aug2002 2 >1 30Aug2002 2 >2 10Mar2002 3 >2 12Mar2002 4 >2 20Mar2002 4 >2 16Apr2002 3 >2 18Apr2002 3 >2 28Apr2002 3 >2 14May2002 3 >2 16May2002 3 >2 26May2002 3 > >I want to end up with a dataset with the closest date (seconddate) >that match the firstdate and the date has to come before the first >date. The final dataset will look like this: > >ID Firstdate Type >1 20Jan2002 3 >1 18Aug2002 2 >1 30May2002 3 >2 15Mar2002 4 >2 20Apr2002 3 >2 18May2002 3 > >The problem is that ID occurs multiple times in both datasets. I >probably also have to set up some kind of condition as to how far back >I want to go to find the closest date: >Firstdate-20 <= Seconddate <= Firstdate > >Any hints/tips as to how to do this appreciated.


Back to: Top of message | Previous page | Main SAS-L page