Date: Fri, 6 Nov 1998 08:07:35 -0500
Reply-To: "Zack, Matthew M." <mmz1@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: "Zack, Matthew M." <mmz1@CDC.GOV>
Subject: Re: Merging files--Should be concatenating files
Content-Type: text/plain; charset="iso-8859-1"
In reply to Fearless, who stated that
> I have two SAS datasets that I want to merge. Dataset 1 has
> 2524 rows and 86 variables. Dataset 2 has 131 rows and 42
> variables. All 42 variables of dataset 2 are variables
> found in Dataset 1 but none of the rows have matches. I
> want to merge the two files into a 3rd file deleting all
> variables that aren't in both files so that I have one file
> with 2655 rows and 42 variables.
>
> A straightforward merge does not solve the problem. It
> concatenates the two files but doesn't delete the variables
> found in one file but not in the other. I don't do this
> kind of stuff very often and I'm not finding anything
> terribly helpful in the helpfiles or the manuals I have at
> my disposal.
>
> Any suggestions on how best to proceed?
>
> Thanks.
>
> --
> Marc "Fearless" Feldesman
>
You actually want to concatenate or to append but not to merge the data set
with fewer variables on to the data set with more variables, keeping the
variables from the former in an output data set. One way with smaller
example data sets is listed below.
Matthew Zack
----------------------------------------------------------------------------
-------
options pageno=1 pagesize=54 linesize=77;
title1 "Concatenating two files with the same variables";
* Macro to append a smaller data set on to a larger dataset,;
* so that the latter contains only the variables in the former.;
%macro varselct(largedsn=, /* Data set with more variables */
smalldsn=, /* Data set with fewer variables */
dsnout= /* Output combined data set */);
option mprint;
%* If no output data set specified, let the data set;
%* with more variables be the output data set.;
%if ("&dsnout" eq "") %then %do;
%let dsnout=&largedsn;
%end;
* Select variable names from smaller data set;
proc contents data=&smalldsn noprint out=__names(keep=name);
run;
* Create macro variables for the names from the smaller data set;
data _null_;
retain nv 0;
if (eofile eq 1) then do;
call symput("nv",left(put(nv,8.)));
stop;
end;
set __names end=eofile;
nv=nv+1;
call symput("_v"||left(put(nv,8.)),left(name));
run;
* Delete intermediate data set.;
proc datasets nolist;
delete __names;
quit;
* Append smaller data set on to larger data set, with the latter;
* including only the variables in the former.;
%if ("&largedsn" eq "&dsnout") %then %do;
proc append data=&smalldsn
base=&largedsn(keep=
%do i=1 %to &nv;
&&_v&i
%end;
);
run;
%end;
%else %do;
proc append data=&largedsn(keep=
%do i=1 %to &nv;
&&_v&i
%end;
) base=&dsnout;
run;
proc append data=&smalldsn base=&dsnout;
run;
%end;
* Print final data set.;
proc print data=&dsnout;
title2 "Final data set, &dsnout";
run;
%mend varselct;
* Example of larger data set.;
data larger;
infile cards;
input a b c d;
output larger;
cards;
1 2 3 4
5 6 7 8
9 0 1 2
3 4 5 6
7 8 9 0
;
run;
* Print larger data set;
proc print data=larger;
title2 "Larger data set";
run;
* Example of smaller data set.;
data smaller;
infile cards;
input b d;
output smaller;
cards;
7 9
3 1
;
run;
* Print smaller data set;
proc print data=smaller;
title2 "Smaller data set";
run;
%varselct(largedsn=larger,smalldsn=smaller,dsnout=newdsn);
----------------------------------------------------------------------------
-------
The output from this program follows:
----------------------------------------------------------------------------
-------
Concatenating two files with the same variables
1
Larger data set 07:22 Friday, November 6,
1998
OBS A B C D
1 1 2 3 4
2 5 6 7 8
3 9 0 1 2
4 3 4 5 6
5 7 8 9 0
Concatenating two files with the same variables
2
Smaller data set
07:22 Friday, November 6,
1998
OBS B D
1 7 9
2 3 1
Concatenating two files with the same variables
3
Final data set, newdsn
07:22 Friday, November 6,
1998
OBS B D
1 2 4
2 6 8
3 0 2
4 4 6
5 8 0
6 7 9
7 3 1