LISTSERV at the University of Georgia
Date:   Tue, 20 Sep 2011 20:58:13 +0000
Reply-To:   Mike Rhoads <RHOADSM1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Mike Rhoads <RHOADSM1@WESTAT.COM>
Subject:   Re: Remove duplicate rows
Comments:   To: Michael Raithel <michaelraithel@WESTAT.com>
In-Reply-To:   <DCCDE0D83A1D0E43BB7001C29ED5B5CA06320635@EX10MAIL1.westat.com>
Content-Type:   text/plain; charset="us-ascii"

Hmmmm.

I have not played around with this much, and I agree with Toby and Michael that you need to sort by all variables. However, I didn't recall that you had to sort twice.

The 9.2 documentation for NODUPRECS states in part that, when this option is specified, "PROC SORT compares all variable values for each observation to the ones for the previous observation that was written to the output data set." To me, the part about making the comparison as the records are being written to the output data set suggests that the first sort is not necessary. (And if it is, I certainly hope SAS will improve the documentation at some point.)

Ref: http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146878.htm#a003070995
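The documented behaviour quoted above can be tried on a tiny data set. What follows is only a sketch under stated assumptions, not a tested result: the data set `mess` is rebuilt here with made-up values, and the variables `id` and `value` and the output data set names are hypothetical. It also assumes the default EQUALS behaviour, under which observations that tie on the BY variables keep their original relative order.

```sas
/* Hypothetical data: the first and third rows are identical but not adjacent. */
data mess;
  input id value $;
  datalines;
1 a
1 b
1 a
;
run;

/* Partial BY list: every row ties on id, so (assuming the default EQUALS
   behaviour) the input order is kept and the identical rows stay apart.
   Each row is compared only with the previous row written to the output. */
proc sort data=mess out=by_id noduprecs dupout=dups_by_id;
  by id;
run;

/* Full BY list in a single step: identical rows sort next to each other,
   so the comparison with the previously written row can catch them. */
proc sort data=mess out=by_all noduprecs dupout=dups_by_all;
  by _all_;
run;

/* Compare the observation counts of the two results. */
proc sql;
  select memname, nobs
    from dictionary.tables
    where libname = 'WORK' and memname in ('BY_ID', 'BY_ALL');
quit;
```

If the documentation's wording holds, BY_ID would be expected to keep all three rows and BY_ALL only two, which would suggest that one sort with BY _ALL_ is enough. Running it is the way to find out.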

Mike Rhoads RhoadsM1@Westat.com

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Michael Raithel
Sent: Tuesday, September 20, 2011 4:29 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Remove duplicate rows

Dear SAS-L-ers,

Toby posted the following to Richard's interesting question:

> Richard....
>
> I could be mistaken here but somewhere I remembered when you use
> noduprec you have to sort it first by all the variables and then sort
> it again with the noduprec as the duplicate records have to be
> sequential in the data set.
>

Toby, Bingo; I was thinking the exact same thing! I was going to suggest (using Richard's example):

proc sort data=mess;
  by _all_;
run;

proc sort data=mess noduprec dupout=mess_duplicates_removed;
  by _all_;
run;

So, now we have a nomination and a second. Perhaps the motion passes. (Man, I've been living in the Washington, DC area for way too long!)

Toby, best of luck in all your SAS endeavors!

Take Care!

----MMMMIIIIKKKKEEEE (aka Michael A. Raithel)

