Date: Wed, 12 Jan 2005 23:57:13 -0500
Reply-To: "Richard A. DeVenezia" <radevenz@IX.NETCOM.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <radevenz@IX.NETCOM.COM>
Subject: Re: Transposing Variables
Elly Hodges wrote:
> Hi Everyone,
>
> I need some help with some SAS code. Let me show you an example of the
> data before I get to my specific question. The names of the variables
> and their actual values are unimportant here.
>
> ID Var1 Var2 Num
> 1 AA AB 1
> 1 C AC 2
> 1 AB AA 3
> 1 AD AF 4
> 2 YU KO 5
> 2 KO YU 6
> .......
>
> I need to somehow eliminate duplicates. It doesn't appear that there
> are duplicates but there are. For example, for subject ID=1, numbers
> 1 & 3 are exactly the same except that the response listed under each
> variable is reversed. Also, ID=2, the 2 lines are the same. I need to
> delete one of these lines that is the duplicate, and it doesn't
> matter which one is deleted. There are more than 1 million
> observations in my data set, so I can't easily just look at the data
> and delete the duplicates.
>
> Any suggestions on how to do this? I know I need to somehow sort
> this, but not sure how.
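This bit-o-sql will create a deduped table. A view tags each row with its
original position (rowid), the query groups rows by a key built from Var1
and Var2 that is the same whichever order the two values appear in, and
only the row with the lowest rowid in each group is kept.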
data foo;
input ID Var1 $ Var2 $ Num;
cards;
1 AA AB 1
1 C AC 2
1 AB AA 3
1 AD AF 4
2 YU KO 5
2 KO YU 6
run;
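* View that tags each row with its original position;
data foov / view=foov;
  set foo;
  rowid = _n_;
run;
proc sql;
  create table fooDeDupe as
  select * from foov
  group by
    ID,   /* a reversed pair is a duplicate only within the same ID */
    case  /* order-independent key: smaller value first */
      when var1<var2 then catx('.',var1,var2)
      else catx('.',var2,var1)
    end
  having rowid=min(rowid)  /* keep the first row of each group */
  order by rowid
  ;
quit;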
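If you would rather stay out of SQL, the same idea can be sketched with a
data step and PROC SORT NODUPKEY. This is an untested sketch, not a
drop-in replacement: the names lo, hi, fooKeyed and fooDeDupe2 are made up
for illustration, the $8 length assumes Var1 and Var2 are no longer than 8
characters (as in the sample data above), and it assumes a reversed pair
counts as a duplicate only within the same ID.
data fooKeyed;
  set foo;
  rowid = _n_;               /* remember the original row order */
  length lo hi $8;           /* assumed length, adjust to your data */
  if var1 < var2 then do; lo = var1; hi = var2; end;
  else do; lo = var2; hi = var1; end;
run;
* Keep one row per ID and unordered pair;
proc sort data=fooKeyed out=fooDeDupe2 nodupkey;
  by ID lo hi;
run;
* Restore the original order and drop the helper columns;
proc sort data=fooDeDupe2 out=fooDeDupe2(drop=lo hi);
  by rowid;
run;
With the default EQUALS behavior of PROC SORT, the row kept for each pair
should be the one that came first in the input, so for the sample data I
would expect the rows with Num 1, 2, 4 and 5 to remain.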
--
Richard A. DeVenezia
http://www.devenezia.com/