Date: Mon, 23 Oct 2006 08:24:13 -0700
Reply-To: "Pardee, Roy" <pardee.r@GHC.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Pardee, Roy" <pardee.r@GHC.ORG>
Subject: Re: proc sort nodupkey results
Content-Type: text/plain; charset="iso-8859-1"
Here's a sql solution (untested) that gets you all duped recs (e.g., both copies of each in one dset):
Proc sql ;
create table all_dupes as
select m.*, m2.num_recs
from my_table as m INNER JOIN
(select first_key_var, second_key_var, count(*) as num_recs
from my_table
group by first_key_var, second_key_var
having count(*) > 1
) as m2
on m.first_key_var = m2.first_key_var AND
m.second_key_var = m2.second_key_var
Quit ;
HTH,
-Roy
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Droogendyk, Harry
Sent: Saturday, October 21, 2006 7:41 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: proc sort nodupkey results
A follow-up data step will do what you require:
proc sort data = blah;
out = blah_sorted;
by sort_var;
run;
data blah_dups;
set blah_sorted;
by sort_var;
if not(first.sort_var and last.sort_var); run;
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of data _null_;
Sent: 2006, October, 21 10:34 AM
To: plw213
Cc: SAS-L@listserv.uga.edu
Subject: Re: proc sort nodupkey results
As Arthur Tabachneck <art297@netscape.net> mentioned DUPOUT proc sort option may be what you want.
It does not give you all the dups just the obs that are excluded by NODUPKEY. Sort of like "if NOT first.rightMostByVar;" If you want all observations that have duplicate BY values you can use a data step and select all records "if NOT (First.(rightMostByVar) and Last.(rightMostByVar));
Notice the result of both DUPOUT and the data step method, in this example.
proc sort data=sashelp.class nodupkey out=work.test dupout=work.dups; by sex age; run; proc print data=work.test; proc print data=work.dups; run;
proc sort data=sashelp.class out=work.test2; by sex age; run; data work.dups2; set work.test2; by sex age; if not(first.age and last.age) then output; run; proc print; run;
On 10/21/06, plw213 <Paul.Leland@gmail.com> wrote:
> Does anyone know of a simple way to get the opposite results of using
> the nodupkey option with proc sort? In other words, I want all of the
> key-duplicated obs stored to a dataset.
> I have been able to accomplish this by perforfming several iterations
> of sorts and merges, but it is not very elegent. If there is a simple
> stright-forward solution, I would greatly appreciate hearing about it!
>
> Thanks, Paul W.
>
_______________________________________________________________________
This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations.
Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized.
If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.
Ce courrier électronique est confidentiel et protégé. L'expéditeur ne renonce pas aux droits et obligations qui s'y rapportent.
Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) désigné(s) est interdite.
Si vous recevez ce courrier électronique par erreur, veuillez m'en aviser immédiatement, par retour de courrier électronique ou par un autre moyen.