| Date: | Sat, 3 Jan 2004 02:50:03 GMT |
| Reply-To: | Arthur Tabachneck <art297@NETSCAPE.NET> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | Arthur Tabachneck <art297@NETSCAPE.NET> |
| Subject: | Re: Merge for duplicate records |
|---|
Sigurd's solution gives you what you were looking for but, as usual, non-sql
solutions are also available. I've listed one, below, but first want to
point out a problem of which you should be aware, namely that you are
attempting to merge records where duplicates exist in both files.
As a result, you are probably not getting the desired resulting file.
Art
data tmp last;
retain count;
merge data_A (in=tv) data_B (in=td);
by ID DT;
if first.dt then count=1;
else count+1;
pattern = tv + 10*td;
output tmp;
if last.dt then output last;
run;
proc print data=last;
run;
----------
"J. Man" <jman_200045@hotmail.com> wrote in message
news:db3fd05.0401021249.1a4d9e13@posting.google.com...
> How do I get the frequency of duplicate records for BY variables. In
> the following example, I prgram to give "data_A has THREE observations
> for 102-06Aug01 and data_B has TWO records for the same BY ID DT."
> similarly for other records.
>
> Am I mising an option ? Any pointer appreciated and greetings for the
> New Year to all.
>
> J. Man
>
> **** program starts;
>
> data data_A;
> input ID DT $ VNO $;
> cards;
> 101 25May01 v1
> 102 30Jul01 v2
> 102 30Jul01 v3
> 102 06Aug01 v4
> 102 06Aug01 v5
> 102 06Aug01 v6
> ;
> run;
>
> data data_B;
> input ID DT $ DNO $;
> cards;
> 102 30Jul01 d1
> 102 06Aug01 d2
> 102 06Aug01 d3
> 102 13Aug01 d4
> 103 24Aug01 d5
> ;
> run;
>
> proc sort data=data_A;
> by ID DT;
> run;
> proc sort data=data_B;
> by ID DT;
> run;
>
> data tmp;
> merge data_A (in=tv) data_B (in=td);
> by ID DT;
> pattern = tv + 10*td;
> run;
>
> proc print data=tmp;
> run;
|