LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2007, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Wed, 3 Oct 2007 11:15:15 -0400
Reply-To:   Jack Clark <JClark@CHPDM.UMBC.EDU>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Jack Clark <JClark@CHPDM.UMBC.EDU>
Subject:   Re: how to flag duplicates ?
Comments:   To: "Terjeson, Mark" <Mterjeson@russell.com>, Ram Kumar <ram.sasha@gmail.com>
In-Reply-To:   <16FD64291482A34F995D2AF14A5C932C015A78D1@MAIL002.prod.ds.russell.com>
Content-Type:   text/plain; charset="us-ascii"

Mark,

I like the solution. To be compatible with the code I provided, a slight modification is necessary because the separator I was using was ', ' - not ','. The extra space between values of VAR1 is the difference.

Ram - if you don't need/want the space between values of VAR1, eliminate it in the CATX or COMPBL function. If you do want it, a minor change to Mark's nice solution should work...

* dynamically determine max length for vlist ; proc sql noprint; select max(lenpluscommas)-2 into :maxvlist from ( select sum(length(var1)) as maxgrplen, count(id) as grpcount, calculated maxgrplen + (calculated grpcount)*2 as lenpluscommas from have group by id ); quit;

%put maxvlist is >&maxvlist<;

Jack Clark Research Analyst Center for Health Program Development and Management University of Maryland, Baltimore County

-----Original Message----- From: Terjeson, Mark [mailto:Mterjeson@russell.com] Sent: Wednesday, October 03, 2007 10:58 AM To: Jack Clark; SAS-L@LISTSERV.UGA.EDU; ram.sasha@gmail.com Subject: RE: Re: how to flag duplicates ?

Hi Jack and Ram,

Here is one way to dynamically determine the length of the vlist variable for the data supplied:

* dynamically determine max length for vlist ; proc sql noprint; select max(lenpluscommas)-1 into :maxvlist from ( select sum(length(var1)) as maxgrplen, count(id) as grpcount, calculated maxgrplen + calculated grpcount as lenpluscommas from have group by id ); quit; %put maxvlist is >&maxvlist<;

then: length vlist $100 duplicates $1; becomes: length vlist $&maxvlist duplicates $1;

Hope this is helpful.

Mark Terjeson Senior Programmer Analyst, IM&R Russell Investments

Russell Investments Global Leaders in Multi-Manager Investing

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Jack Clark Sent: Wednesday, October 03, 2007 7:23 AM To: SAS-L@LISTSERV.UGA.EDU Subject: Re: how to flag duplicates ?

Ram,

Here is a possible DATA step solution. I set the concatenated variable's length to $100 - you can adjust as appropriate. Or maybe someone else can suggest a more dynamic way to manage the length.

data have; infile cards missover; input id var1 $2.; cards; 1 xy 1 bc 1 cn 2 lm 2 lc 3 cm 3 cn 4 fr 5 gt 6 uy run;

proc sort data = have; by id var1; run;

data need1 (keep=id vlist) need2 (keep=id var1 duplicates); length vlist $100 duplicates $1; set have; by id; retain vlist; if first.id then vlist = var1; else do; duplicates = '1'; vlist = catx(', ',vlist,var1); end; if last.id then output need1; output need2; run;

title1 'First Format'; proc print data = need1; var id vlist; run;

title1 'Second Format'; proc print data = need2; var id var1 duplicates; run;

Jack Clark Research Analyst Center for Health Program Development and Management University of Maryland, Baltimore County

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of ram.sasha@gmail.com Sent: Wednesday, October 03, 2007 9:59 AM To: SAS-L@LISTSERV.UGA.EDU Subject: how to flag duplicates ?

I have a dataset in the following format:

id var1 1 xy 1 bc 1 cn 2 lm 2 lc 3 cm 3 cn 4 fr 5 gt 6 uy

I want the above dataset in 2 different formats: format 1: concatenate all the var1 values into one variable id var1 1 xy,bc,cn 2 lm ,lc 3 cm,cn 4 fr 5 gt 6 uy

format 2: requires a dup flag in the duplicate entries

id var1 duplicates 1 xy 1 bc 1 1 cn 1 2 lm 2 lc 1 3 cm 3 cn 1 4 fr 5 gt 6 uy

Thanks for the help Ram


Back to: Top of message | Previous page | Main SAS-L page