|
If the join is performed based only on C1, then it'll be a many-many join,
in which SQL produce all possible combinations of that BY group. So doing a
full join based on all variables, will include the original rows from A and
any new rows from B. Revised SQL code:
Proc sql feedback;
select coalesce(a.c1,b.c1) as c1, coalesce(a.p2,b.p2) as p2,
coalesce(a.d3,b.d3) as d3, coalesce(a.e4,b.e4) as e4
from a full join b
on a.c1 = b.c1 and a.p2=b.p2 and a.d3=b.d3 and a.e4=b.e4
order by 1,2,3;
Quit;
Akshaya
On Tue, Nov 25, 2008 at 2:08 PM, morfar <mofa@swift.se> wrote:
> is there an easy way to run this and not end up with duplicate rows, which
> then have to be separately deduped?
>
> the goal is to get all new and different items from "Data b" without losing
> the data from "Data a"
>
> data a;
> input c1 $ p2 $ d3 $ e4 $;datalines;
> 1001 601 010108 A1
> 1001 501 020108 B1
> 1001 401 030108 C1
> ;;;;
> run;
> data b;
> input c1 $ p2 $ d3 $ e4 $;datalines;
> 1001 601 010108 A1
> 1001 501 020108 B1
> 1001 401 030108 C1
> 1001 301 010108 A1
> 1001 201 020108 T2
> 1001 101 030108 XX
> 1001 901 030108 XX
> ;;;;
> run;
> proc sql;create table joined as select a.*,b.*
> from a a full join b b
> on a.c1 = b.c1
> order b.p2, b.d3;
> quit;
>
|