|
On Thu, 21 Dec 2006 12:53:49 -0600, Dimitri Shvorob
<dimitri.shvorob@VANDERBILT.EDU> wrote:
>.. Is there a difference, as far as results are concerned? I used to
>think there wasn't: if one were to do a left join of A and B, WHERE
>would filter incoming rows of B before SAS attempted to match them to A
>rows, whereas ON would let B rows in, match them to those in A, and then
>apply the condition. In either case, rows of B failing the condition
>would not show up in the output table.
>
>The SQL calls below differ only in that the same (static!) condition,
>B.PERIOD = 2, is supplied first with ON, and then with WHERE, yet
>produce tables with very different sizes. (The ON version gives correct
>results).
>
>Could the PROC SQL cognoscenti of SAS-L please enlighten me what's
>happening?
>
>Thank you.
>
>
>PS. The examples are actual code, with a fragment (a tricky HAVING
>block) commented out; I had suspected that it was to blame, wrongly.
>
>
>
>
>(TEMP9 has 943 rows)
>
>proc sql;
> create table tempa
> as select a.*, b.analys, b.estdats as estdats_q3pre, b.value as est_q3pre
> from temp9 a left join temp0 b
> on b.ticker = a.ticker
> and b.analys = a.analys
> and b.fpedats = a.fpedats_q3
> and b.pdf = a.pdf
> and b.estdats < a.repdats_q0
> and b.period = 2 /*
> group by a.analys, a.fpedats_q0, a.pdf
> having b.estdats = max(b.estdats) */ ;
> quit;
>
>(TEMPA has 1103 rows)
>
>proc sql;
> create table tempb
> as select a.*, b.analys, b.estdats as estdats_q3pre, b.value as est_q3pre
> from temp9 a left join temp0 b
> on b.ticker = a.ticker
> and b.analys = a.analys
> and b.fpedats = a.fpedats_q3
> and b.pdf = a.pdf
> and b.estdats < a.repdats_q0
> where b.period = 2 /*
> group by a.analys, a.fpedats_q0, a.pdf
> having b.estdats = max(b.estdats) */;
> quit;
>
>(TEMPB has 231 rows)
Dmitri,
Check out this thread on the order of execution in outer joins ...
http://listserv.uga.edu/cgi-bin/wa?A2=ind0608D&L=sas-l&P=R30741&m=206294
HTH,
Ken
|