LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2000, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 26 Oct 2000 01:53:16 -0700
Reply-To:     kmself@IX.NETCOM.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         kmself@IX.NETCOM.COM
Subject:      Re: data step merge --> sql join
In-Reply-To:  <39F7BA2A.29011A81@ix.netcom.com>; from julierog@IX.NETCOM.COM on
              Thu, Oct 26, 2000 at 05:00:35AM +0000
Content-Type: multipart/signed; micalg=pgp-sha1;
protocol="application/pgp-signature";

on Thu, Oct 26, 2000 at 05:00:35AM +0000, Roger Lustig (julierog@IX.NETCOM.COM) wrote: > Hi, Karsten! > > Not quite. The DATA step is doing a left join, not an inner join. > (See annotation in DATA step below.) Also, the match-type counters > require a full join, so a full join with a WHERE= in the > data set options for set_c might come a little closer. Even then, > doing a PUT of record counts to the log would probably require a > separate query, one with the INTO :<macro_var> feature.

Thanks. I've not been using SAS for about a year, still playing with SQL elsewhere. Getting rusty. Thought I'd post a non-rant for a change (but don't get used to it <g>). Many people aren't aware of the CASE/WHEN/ELSE/END construct in PROC SQL (and surprisingly many don't use SQL...). Yet another reason not to take SAS-L responses quite at face value. Surprised someone (named Sig?) didn't step in on this one.

Good catch on the left v. full join, I miss that frequently until I go about reconciling records -- data step is rather better at accounting than SQL.

My rule of thumb is that first. and last. processing are often kludges around something else. Frequently they're proxies for GROUP BY coupled with a SELECT DISTINCT, to summarize data. Looks as if someone's trying to do some funkiness with the "var2_a" bit. I might mop up that and a few other details with a data step view before or after the SQL, if they were truly necessary.

Combined effects (X1, X2) can be used by computing an intermediate value, though I'd have to check to see whether CASE/WHEN/ELSE supports the CALCULATED keyword.

> This does a little more of the job: > > proc sql; > CREATE TABLE set_c AS > SELECT > * > , CASE var1_a > WHEN 0 THEN var1_b > WHEN 1 THEN var2_b > WHEN 2 THEN var3_b > ELSE . > END /* <--------------------- gotta have one of these! > */ > AS newvar > /* etc. */ > FROM > set_a > LEFT JOIN > set_b > ON set_a.byvar1 eq set_b.byvar1 > and > set_a.byvar2 eq set_b.byvar2 > > ; > quit; > > But that /* etc. */ is the tough part! > > There's no explicit way to do FIRST.<var> and LAST.<var> processing > in a SQL query, because tables are (conceptually) unordered. The only > solution is to make an identifier variable for the record that will be > "first.". > > Also, unless there's a variable in each input table that's never > missing, > it's hard to test for the presence of one or the other input table > in a particular row of the joined table (the full-join bit that > increments > the counters). > > This sort of join-plus-other-stuff is a fine example of the flexibility > of the DATA step with the MERGE statement, if you ask me...when you > learn > SQL, you get those cool examples of one join doing the same thing as > five DATA steps, four PROC SORTs, and two PROC SUMMARYs. To do all the > things Michael accomplishes in his DATA step, you need two DATA steps > to set up the identifiers for FIRSTness, then a left join to make the > data > set, then a full join to get the counts, possibly some macro stuff to > get > the counts into and out of macro variables and into the log, etc. > > Unless I'm missing something really obvious... > > Roger > > kmself@IX.NETCOM.COM wrote: > > > > on Mon, Oct 23, 2000 at 01:21:15PM -0400, Schulingkamp, Michael E (michael.e.schulingkamp@LMCO.COM) wrote: > > > I'm looking for some help with translating a data step merge into an sql > > > join. I'm not too familiar with proc sql but I've taken a stab at putting a > > > shell together. I just need help filling in the pieces. Here is a > > > simplified version of the code that I'm working with... > > > > > > set_a set_b > > > -------- -------- > > > byvar1 byvar1 > > > byvar2 byvar2 > > > var1_a var1_b > > > var2_a var2_b > > > var3_b > > > > > > data set_c ; > > > merge set_a (in=x1) set_b (in=x2) end=last ; > > > by byvar1 byvar2 ; > > > > > > if (var1_a=0) then newvar = var1_b ; > > > else if (var1_a=1) then newvar = var2_b ; > > > else if (var1_a=2) then newvar = var3_b ; > > > > > > if (x1=1 & x2=1) then match+1 ; /* < --- This part is full-joinish! */ > > > else if (x1=1 & x2=0) then in_a+1 ; > > > else if (x1=0 & x2=1) then in_b+1 ; > > > else if (x1=0 & x2=0) then nomatch+1 ; /* hopefully this never happens! */ > > > > > > if (first.byvar2=1) then var2_a = var2_a ; > > > else var2_a = . ; > > > > > > if (x1=1) then output set_c ; /* <----------------Left join! */ > > > > > > if (last=1) then put match= nomatch= in_a= in_b= ; > > > run; > > > > proc sql; > > create table c as > > select > > * > > , case var1_a > > when 0 then var1_b > > when 1 then var2_b > > when 3 then var3_b > > else . > > as newvar > > /* etc. */ > > from set_a, set_b > > where set_a.byvar eq set_b.byvar > > ; > > quit; > > > > -- > > Karsten M. Self <kmself@ix.netcom.com> http://www.netcom.com/~kmself > > Evangelist, Opensales, Inc. http://www.opensales.org > > What part of "Gestalt" don't you understand? There is no K5 cabal > > http://gestalt-system.sourceforge.net/ http://www.kuro5hin.org > > GPG fingerprint: F932 8B25 5FDD 2528 D595 DC61 3847 889F 55F2 B9B0 > > > > ------------------------------------------------------------------------ > > Part 1.2Type: application/pgp-signature

-- Karsten M. Self <kmself@ix.netcom.com> http://www.netcom.com/~kmself Evangelist, Opensales, Inc. http://www.opensales.org What part of "Gestalt" don't you understand? There is no K5 cabal http://gestalt-system.sourceforge.net/ http://www.kuro5hin.org GPG fingerprint: F932 8B25 5FDD 2528 D595 DC61 3847 889F 55F2 B9B0


[application/pgp-signature]


Back to: Top of message | Previous page | Main SAS-L page