LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (July 2008, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Thu, 24 Jul 2008 23:13:40 -0400
Reply-To:   "Howard Schreier <hs AT dc-sug DOT org>" <schreier.junk.mail@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Howard Schreier <hs AT dc-sug DOT org>" <schreier.junk.mail@GMAIL.COM>
Subject:   Re: Selectively delete duplicates

On Thu, 24 Jul 2008 16:23:05 -0700, Matt <mmandel2@NYC.RR.COM> wrote:

>I have a table with many duplicate key values (se_no). When I >encounter a duplicate, I need to determine which one to keep based on >the values in id1 and id2. The optimal observation is one in which >both values are 0; next best is 0 and 6; last is 6 and 0. I need a >method for reading in each observation until I reach one that I can >write to a new dataset. Here's a sample of the data: > >input se_no $10. id1 id2 >; >datalines; >2200156354 . . >2200156446 0 0 >2200156446 0 6 >2200156446 0 6 >2200156446 6 6 >2200156453 . . >2200156495 0 6 >2200156495 6 6 >2200156495 0 6 >2200156495 6 6 >2200156511 . . >; > >I would greatly appreciate any suggestions from you experts out there. > >Thanks! >Matt

This SQL solution seems rather natural:

proc sql; create table test2(drop = opt) as select distinct * , case cats(id1,id2) when '00' then 'A' when '06' then 'B' when '60' then 'C' when '66' then 'D' else 'Z' end as opt from test group by se_no having opt = min(opt); quit;


Back to: Top of message | Previous page | Main SAS-L page