LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2008, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Thu, 1 May 2008 18:12:55 -0700
Reply-To:   jfh@stanfordalumni.org
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Jack Hamilton <jfh@STANFORDALUMNI.ORG>
Subject:   Re: Deduplicate tricky "Cancel" data
Comments:   To: MBWD <xmbwdx@GMAIL.COM>
In-Reply-To:   <dbfde497-4e18-47fc-93b9-ea94988f7fc9@c65g2000hsa.googlegroups.com>
Content-Type:   text/plain; charset="ISO-8859-1"

You can't get SORT SORT to do that. You would have to do a PROC SORT followed by a data step. I think you could also do it with PROC SQL, but I suspect it would be harder to follow and take longer to execute.

I'm not on a machine with SAS installed at the moment, so I can't try it, but my approach would be

- Sort the data set by all fields, with Cancel last.

- Use a double DOW-loop to process the data. In the first loop, determine whether you have a canceled group (using all the variables except Cancel in the BY statement). In the second loop, apply your findings to the output observations.

On Thu, 1 May 2008 15:09:45 -0700, "MBWD" <xmbwdx@GMAIL.COM> said: > I can't figure out how to get the below data to dedupe (sort of) by > deleting the two canceled observations AND their two counterparts (or > alternatively, just creating a binary variably 1,0 for those > observations that match the cancel row). > > This would delete all but rows 1 and 6. Rows 2 and 3 are counterparts > in a cancel transaction. So are rows 3 and 4. > > Account Name B Cancel Quantity Stock Price Entry Date Net > ABC1234 Fred S -175 XOM $89.00 12/1/2007 > ($15,575.00) > ABC1234 Fred S -175 XOM $89.00 1/5/2008 > ($15,575.00) > ABC1234 Fred S X 175 XOM $89.00 1/5/2008 > $15,575.00 > ABC1234 Fred S -175 XOM $89.00 1/8/2008 > ($15,575.00) > ABC1234 Fred S X 175 XOM $89.00 1/8/2008 > $15,575.00 > ABC1234 Fred B 105 ATT $25.00 12/1/2007 > $2,625.00 > > > In addition, it would be very helpful if someone know how to get the > PROC SORT DUPKEY method to NOT delete any observations, but rather > just mark as "1" and "0" those observations that it would have > deleted. > > Thanks for any and all help!!

-- Jack Hamilton Sacramento, California jfh@alumni.stanford.org


Back to: Top of message | Previous page | Main SAS-L page