Date: Tue, 22 Jul 2003 10:07:49 -0400
Reply-To: Ian Whitlock <WHITLOI1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject: Re: Merge bug
Content-Type: text/plain
Michael,
Here is an example divorced from your problem.
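data w ;
   x = "abc " ;           /* three letters plus a trailing blank */
   output ;
   x = "abc" || "00"x ;   /* three letters plus a trailing null byte */
   output ;
run ;
/* Default format: the two values look alike in the log */
data _null_ ;
   set w ;
   put x= ;
run ;
/* Hex format: the last byte of each value is shown explicitly */
data _null_ ;
   set w ;
   put x= $hex8. ;
run ;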
Note that the second step makes the two values of X look the same. The
third step reveals the difference.
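The same check can be applied to numeric variables. Here is a minimal
sketch, assuming your BY variables are numeric (as a BEST12. format
suggests). The values of A and B are made up for illustration and are not
taken from your data.

```sas
data _null_ ;
   a = 0.1 + 0.2 ;              /* hypothetical value, computed rather than typed */
   b = 0.3 ;                    /* hypothetical value, typed directly */
   put a= b= ;                  /* default format: the two should look alike */
   put a= hex16. b= hex16. ;    /* hex16. shows the stored floating-point value */
   if a ne b then put "a and b are not equal" ;
run ;
```

If the hex strings differ, the values are different as far as BY processing
is concerned, whatever the default format prints.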
Mismatched lengths cause subtle problems of the same kind. That is why I
suggested eliminating them first.
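If the hex output does show differences in GJOB, one possible repair is to
round it in both data sets to the three decimal places you expect before
merging. This is only a sketch. It assumes the cause is numeric imprecision,
which has not been established, and that COP and GART are already sorted by
YEAR and GJOB. The names COP2 and GART2 are made up for the example.

```sas
/* Round the BY variable to three decimals in each input (assumed precision) */
data cop2 ;
   set cop ;
   gjob = round(gjob, 0.001) ;
run ;
data gart2 ;
   set gart ;
   gjob = round(gjob, 0.001) ;
run ;
/* Repeat the merge on the rounded copies */
data inCOP inGART inBOTH ;
   merge cop2 (in = fromone) gart2 (in = fromtwo) ;
   by year gjob ;
   if fromone and fromtwo then output inBOTH ;
   else if fromone then output inCOP ;
   else output inGART ;
run ;
```

If INCOP is still populated after this, the difference lies elsewhere and
the hex listing is the place to look.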
IanWhitlock@westat.com
-----Original Message-----
From: Michael Murff [mailto:MurffMJ@LDSCHURCH.ORG]
Sent: Monday, July 21, 2003 4:54 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Merge bug
I'm almost certain that the logic is okay. I tried Ron's syntax
DATA inCOP inGART inBOTH;
merge cop (IN = fromone) gart (IN = fromtwo) ;
by year gjob;
if fromone AND fromtwo then output inBOTH;
else if fromone and not fromtwo then output inCOP ;
else if not fromone and fromtwo then output inGART;
RUN;
and it yielded an "inCOP" dataset that contains the same set of observations
that were not merging properly. These unmatched observations DO have
by-value matches in the "inGART" dataset.
I think Curtis is on the right track. Proc import seems like a pretty crude
tool that can sometimes produce anomalies, especially when drawing from
formatted Excel files. I would feel a lot better about "cleaning" the data
in SAS, if at all possible; it's important that I can go straight from raw
data (i.e. the spreadsheet as is from the survey company) to finished report
with the "raw data" imported from Excel. Assuming there are hidden
characters of some kind, what SAS functions might be useful for parsing out
what is observable? The by-variable data is in the form of year=#### and
gjob=##.###, where each pound sign represents a digit. Incidentally,
PROC IMPORT automatically assigns a BEST12. format.
I checked the lengths and formats which all match, but I don't understand
what Ian meant by " If they are the same then try printing problem
combination in hex." Could you elaborate or perhaps even provide an
example? I'm convinced that this is a data problem, and not one with logic,
so I will forego trying to dumb down the data for simulation.
Thanks,
Mike