LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2008, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 3 Oct 2008 15:54:07 -0500
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: Comparing 2 data sets (easy one again!)
Comments: To: Toby Dunn <tobydunn@HOTMAIL.COM>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
              reply-type=original

Yes, clearly if one has too many results then my code could have also been modified to put the results into a data set:

proc sql noprint; create table ids as select ID from set1 where ID not in (select ID from set2); quit;

But yours is another way to do it; I wonder if it is faster since it needs to do a left join and that's generally slower, but it is certainly an alternative. The user was just asking for ID's but you could select all the data with the *.

Thanks.

-Mary

----- Original Message ----- From: Toby Dunn To: SAS-L@LISTSERV.UGA.EDU Sent: Friday, October 03, 2008 3:32 PM Subject: Re: Comparing 2 data sets (easy one again!)

Mary,

The method you posted will work only for small data sets where the unique number of ID's will fit into the maximum length of a macro variable.

Proc Compare comes to mind as a solution.

If one wanted to do this via SQL then try:

Proc SQL ; Create Table Need As Select One.* From One As One Left Join Two As Two On One.ID = Two.ID Where Missing( Two.ID ) ; Quit ;

The above is called an Anti-Join and yes the Where clause is correct. Actually one could have used a Having but in some testing I did the other day it performed slighty worse than the Where clause. Either way they both produce the same results.

On Fri, 3 Oct 2008 10:16:14 -0500, Mary <mlhoward@AVALON.NET> wrote:

>Using the Not in with a select on on the other data set can do this: > >proc sql noprint; >select ID into: VarList >separated by " " >from set1 >where ID not in >(select ID from set2); >quit; >%put &varlist; > >-Mary > > >----- Original Message ----- >From: Mike Smith >To: SAS-L@LISTSERV.UGA.EDU >Sent: Friday, October 03, 2008 9:58 AM >Subject: Comparing 2 data sets (easy one again!) > > >Hi all SAS experts, > >I am trying to manage two big data sets and trying to figure out which >caseids of one dataset could be found /not found in the other dataset. > >Is there like a procedure aside from just sorting both datasets and >figuring out which ids arent there? > > >Thanks a lot!


Back to: Top of message | Previous page | Main SAS-L page