LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2005, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 3 Nov 2005 11:25:13 -0800
Reply-To:     Dale McLerran <stringplayer_2@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Dale McLerran <stringplayer_2@YAHOO.COM>
Subject:      Re: Compare two tables - Modified
In-Reply-To:  <200511031744.jA3HVQgY008665@malibu.cc.uga.edu>
Content-Type: text/plain; charset=iso-8859-1

--- SUBSCRIBE SAS-L Chandra Gadde <ddraj2015@GMAIL.COM> wrote:

> Hi All, > > Here I am simplyfing my problem. I have two tables t1 and t2; Here is > the > SAS code. > > data t1; > input x y; > cards; > 11 111 > 22 222 > 33 333 > 44 444 > 55 555 > 66 666 > ; > run; > > data t2; > input x y; > cards; > 22 222 > 44 444 > 66 666 > 11 110 > 33 330 > 54 553 > ; > run; > > proc sort data = t1; > by x y; > run; > proc sort data = t2; > by x y; > run; > > proc compare base = t1 compare = t2 brief outnoequal; > run; > > The output for this is .......... > > > > The COMPARE Procedure > Comparison of WORK.T1 with WORK.T2 > (Method=EXACT) > > NOTE: Values of the following 2 variables compare unequal: x y > > > Value Comparison Results for Variables > > > __________________________________________________________ > || Base Compare > Obs || x x Diff. % > Diff > ________ || _________ _________ _________ > _________ > || > 5 || 55.0000 54.0000 -1.0000 > -1.8182 > > __________________________________________________________ > > > > The COMPARE Procedure > Comparison of WORK.T1 with WORK.T2 > (Method=EXACT) > > Value Comparison Results for Variables > > > __________________________________________________________ > || Base Compare > Obs || y y Diff. % > Diff > ________ || _________ _________ _________ > _________ > || > 1 || 111.0000 110.0000 -1.0000 > -0.9009 > 3 || 333.0000 330.0000 -3.0000 > -0.9009 > 5 || 555.0000 553.0000 -2.0000 > -0.3604 > > __________________________________________________________ > > > Here is my question. > > I just need to create this output into another dataset with 5 columns > as > shown here. Any help would be greatly appreciated. I have been trying > to > achieve this but couldn't succed. Please help me out. > > Row t1_Colname t2_Colname t1_value t2_value > 5 x x 55 54 > 1 y y 111 110 > 3 y y 333 330 > 5 y y 555 553 > > Thanks, >

Chandra,

I thought to use an ODS OUTPUT statement with PROC COMPARE. But if you do that, you do not get an output data set which has the fields you would like. Instead, you get an output data set which has all of the above information (and more) in a single text variable. That text variable would have to be parsed. It could be done, but the following approach is simpler for me to code and generates what you display above.

Since you stated previously that you have multiple columns with the same names in your two data sets plus some additional columns on one or the other set which differ, I will write general purpose code that assumes no knowledge of the names of the variables in the data sets to be compared. So, we first use proc contents to find the names of the variables in each data set. We subsequently merge the contents to construct a data set which identifies variables common to both data sets.

We then use the list of common variables to write code which

1) merges the two data sets 2) renames the common variables in each data set so that the names are unique and one set does not overwrite the other 3) loops over the renamed variables comparing their values for the common variables 4) writes out the required information when the values of any two variables which should be the same differ between the two sets

/* Determine names of variables in each data set */ proc contents data=t1 out=t1_Colnames noprint; proc contents data=t2 out=t2_Colnames noprint; run;

/* Determine variables which are common to both data sets */ data Vboth; merge t1_Colnames(in=a) t2_Colnames(in=b); by name; if a & b; run;

/* Now write code which */ /* 1) merges the two data sets, */ /* 2) renames the variables during the merge */ /* 3) loops over the renamed variables comparing their values */ /* 4) write out the required information for values which differ */ data _null_; set Vboth end=lastrec; array vars {100000} $ 32 _temporary_; vars{_n_} = name; if lastrec then do;

/* Start writing code. */ call execute("data diffs;"); call execute(" length Row 3"); call execute(" t1_Colname t2_Colname $ 32"); call execute(" t1_value t2_value 8;");

/* Merge data sets and perform variable renames */ call execute(" merge t1(in=a rename=("); do i=1 to _n_; /* variable renames for first data set*/ call execute(vars{i} || compress("=t1_"||vars{i})); end; call execute(" ))"); call execute(" t2(in=a rename=("); do i=1 to _n_; /* variable renames for second data set*/ call execute(vars{i} || compress("=t2_"||vars{i})); end; call execute(" ));");

/* Construct arrays of common variables from each data set */ call execute(" array t1_vals {*} "); /* Data set 1 list */ do i=1 to _n_; call execute(compress("t1_"||vars{i})); end; call execute(" ;");

call execute(" array t2_vals {*} "); /* Data set 2 list */ do i=1 to _n_; call execute(compress("t2_"||vars{i})); end; call execute(" ;");

call execute(" Row=_n_;"); /* Loop over common variables and check for equality */ call execute(" do i=1 to dim(t1_vals);"); call execute(" if t1_vals{i}^=t2_vals{i} then do;"); /* Inequality found for i-th variable in list */ /* Return name of i-th variable */ call execute(" t1_Colname=substr(vname(t1_vals{i}),4);"); call execute(" t2_Colname=t1_Colname"); /* Return values of i-th variable from each data set */ call execute(" t1_value=t1_vals{i};"); call execute(" t2_value=t2_vals{i};"); /* Output record of difference */ call execute(" output;"); call execute(" end;"); call execute(" end;"); call execute(" keep Row t1_Colname t2_Colname t1_value t2_value;"); call execute("run;");

/* Sort the data by variable and row in which difference is found */ call execute("proc sort data=diffs;"); call execute(" by t1_Colname Row;"); call execute("run;"); end; run;

/* Now print our list of differences */ proc print data=diffs; run;

If you submit this code and then look at the log file where all of the call execute statements have been resolved, I think you will better understand just what this code does.

HTH,

Dale

--------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra@NO_SPAMfhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 ---------------------------------------

__________________________________ Yahoo! FareChase: Search multiple travel sites in one click. http://farechase.yahoo.com


Back to: Top of message | Previous page | Main SAS-L page