Date: Fri, 13 Apr 2012 15:09:50 -0700
Reply-To: Mark Miller <mdhmiller@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mark Miller <mdhmiller@GMAIL.COM>
Subject: Re: Identifying duplicates when certain condition
In-Reply-To: <BLU152-W45C96B2E1118610AF29FCFDE3B0@phx.gbl>
Content-Type: text/plain; charset=windows-1252; format=flowed
Alternatively, do it in one step by modifying the sort:
Proc Sort Data = Have
Out = Unique
Dupout = Dupes
NoDupKey;
By Current_Date Account_Number ;
Run;
Have   -- still has the original data
Unique -- has one record per Current_Date / Account_Number key
Dupes  -- has the second and later records for each repeated key
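
If only one date is of interest, as in the original question (28FEB2010), the same sort can be restricted with a WHERE= data set option on the input. This is a minimal sketch, assuming Current_Date is a numeric SAS date (as in the posted data step) and that the input data set is named Have; note the trailing d that makes '28FEB2010'd a date literal rather than a character string.

```sas
/* Sketch only: same NoDupKey/Dupout sort, limited to one date.      */
/* Assumes Current_Date is a numeric SAS date and the input is Have. */
Proc Sort Data = Have (Where = (Current_Date = '28FEB2010'd))
          Out = Unique
          Dupout = Dupes
          NoDupKey;
   By Current_Date Account_Number;
Run;
```

With the posted sample data, Dupes should then hold one extra record for account 10500 and two for account 200. The first record of each key stays in Unique, so Dupes alone does not list every member of a duplicate group.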
...Mark Miller
On 4/13/2012 2:22 PM, toby dunn wrote:
> Proc Sort
> Data = Have ;
> By Current_Date Account_Number ;
> Run ;
>
>
> Data Need ;
> Set Have ;
> By Current_Date Account_Number ;
>
> If Not ( First.Account_Number and Last.Account_Number ) ;
>
> Run ;
>
>
>
> Once you have this, Need holds every record whose account number occurs more than once within a given value of Current_Date.
>
>
> Toby Dunn
>
>
> If you get thrown from a horse, you have to get up and get back on, unless you landed on a cactus; then you have to roll around and scream in pain.
>
> “Any idiot can face a crisis—it’s day to day living that wears you out”
> ~ Anton Chekhov
>
>
>
>> Date: Fri, 13 Apr 2012 17:12:00 -0400
>> From: neilfrnnd@GMAIL.COM
>> Subject: Identifying duplicates when certain condition
>> To: SAS-L@LISTSERV.UGA.EDU
>>
>> Hi colleagues,
>>
>> I have a data set like this. I want to see if there are duplicate values
>> in variable "Account_number" when the current_date=28FEB2010.
>>
>> data a;
>> Informat current_date date9.;
>> Input Current_date Account_number $ 11-15;
>> Format current_date date9.;
>> datalines;
>> 31JUL2010 10500
>> 31JUL2010 10500
>> 31JUL2010 200
>> 31JUL2010 300
>> 31JUL2010 400
>> 31JUL2010 2400
>> 31JUL2010 2400
>>
>> 28FEB2010 10500
>> 28FEB2010 10500
>> 28FEB2010 200
>> 28FEB2010 200
>> 28FEB2010 200
>> 28FEB2010 2400
>> 28FEB2010 100
>> ;
>> run;
>>
>>
>> I tried the code below, but it doesn't work. Could you please help?
>>
>>
>> Proc sort data=a out=temp;
>> by account_number;
>> run;
>>
>>
>> data temp2;
>> set temp;
>> by account_number;
>> if not (first.account_number and last.account_number) then output;
>> where current_date ='31JUL2010';
>> run;
>>
>> Help is greatly appreciated.
>> Mirisage
>