| Date: | Wed, 3 Oct 2007 07:58:28 -0700 |
| Reply-To: | "Terjeson, Mark" <Mterjeson@RUSSELL.COM> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "Terjeson, Mark" <Mterjeson@RUSSELL.COM> |
| Subject: | Re: how to flag duplicates ? |
|
| In-Reply-To: | A<456B52C41B724C41B96561D7AD283E7D6C770A@mail.chpdm.umbc.edu> |
| Content-Type: | text/plain; charset="us-ascii" |
Hi Jack and Ram,
Here is one way to dynamically determine
the length of the vlist variable for the
data supplied:
* dynamically determine max length for vlist ;
proc sql noprint;
select max(lenpluscommas)-1 into :maxvlist
from (
select sum(length(var1)) as maxgrplen,
count(id) as grpcount,
calculated maxgrplen +
calculated grpcount as lenpluscommas
from have
group by id
);
quit;
%put maxvlist is >&maxvlist<;
then:
length vlist $100 duplicates $1;
becomes:
length vlist $&maxvlist duplicates $1;
Hope this is helpful.
Mark Terjeson
Senior Programmer Analyst, IM&R
Russell Investments
Russell Investments
Global Leaders in Multi-Manager Investing
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Jack Clark
Sent: Wednesday, October 03, 2007 7:23 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: how to flag duplicates ?
Ram,
Here is a possible DATA step solution. I set the concatenated
variable's length to $100 - you can adjust as appropriate. Or maybe
someone else can suggest a more dynamic way to manage the length.
data have;
infile cards missover;
input id var1 $2.;
cards;
1 xy
1 bc
1 cn
2 lm
2 lc
3 cm
3 cn
4 fr
5 gt
6 uy
run;
proc sort data = have;
by id var1;
run;
data need1 (keep=id vlist) need2 (keep=id var1 duplicates);
length vlist $100 duplicates $1;
set have;
by id;
retain vlist;
if first.id then vlist = var1;
else do;
duplicates = '1';
vlist = catx(', ',vlist,var1);
end;
if last.id then output need1;
output need2;
run;
title1 'First Format';
proc print data = need1;
var id vlist;
run;
title1 'Second Format';
proc print data = need2;
var id var1 duplicates;
run;
Jack Clark
Research Analyst
Center for Health Program Development and Management
University of Maryland, Baltimore County
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
ram.sasha@gmail.com
Sent: Wednesday, October 03, 2007 9:59 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: how to flag duplicates ?
I have a dataset in the following format:
id var1
1 xy
1 bc
1 cn
2 lm
2 lc
3 cm
3 cn
4 fr
5 gt
6 uy
I want the above dataset in 2 different formats:
format 1: concatenate all the var1 values into one variable
id var1
1 xy,bc,cn
2 lm ,lc
3 cm,cn
4 fr
5 gt
6 uy
format 2: requires a dup flag in the duplicate entries
id var1 duplicates
1 xy
1 bc 1
1 cn 1
2 lm
2 lc 1
3 cm
3 cn 1
4 fr
5 gt
6 uy
Thanks for the help
Ram
|