Date: Fri, 5 Sep 2008 09:59:19 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Hash lookup not quicker than SQL inner join?
In-Reply-To: <002601c90f2c$3f1ca7b0$bd55f710$@net.nz>
Content-Type: text/plain; charset="us-ascii"
Robin:
What you are saying doesn't surprise me. I cited my experience with left joins and summaries that do not "optimize". A hash object implementation of the left join works much faster. I'd certainly choose the hashing method for a series of transitive joins. (For those unfamiliar with the method, see http://www.nesug.org/proceedings/nesug06/dm/da07.pdf for details of the programming.) I would expect the SAS SQL compiler to have access to the same implementation. I've asked Paul Kent about the possibility of upgrading the SAS SQL query optimizer with this left join implementation, but haven't heard more about it.
S
-----Original Message-----
From: Robin Templer [mailto:templerr@clear.net.nz]
Sent: Friday, September 05, 2008 3:52 AM
To: Sigurd Hermansen; SAS-L@LISTSERV.UGA.EDU
Subject: RE: Hash lookup not quicker than SQL inner join?
Interesting.
We have just started a project to investigate how to speed up some of SQL code that is being used to build some large tables. First off the rank was one sql step join 29 dimension tables to the fact table using left joins which took 180 minutes, Converted to a Datastep hash object and it runs in 20 minutes.
SO our experience is contra to what is being said - but we are still investigating.
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sigurd Hermansen
Sent: Thursday, 4 September 2008 2:56 a.m.
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Hash lookup not quicker than SQL inner join?
Sander:
Yes, a SAS SQL inner join will often when appropriate create a hash index in the background and use it to select specific key values. A SQL compiler develops an execution plan that improves where possible on brute force searches, sorting, etc. Unfortunately SAS SQL does not always find better solutions, as I have found with left joins and with summaries. S
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Sander Burggraaff
Sent: Wednesday, September 03, 2008 6:16 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Hash lookup not quicker than SQL inner join?
On 2 sep, 20:09, jay.so...@GMAIL.COM (Jay Soule) wrote:
> On Tue, Sep 2, 2008 at 9:00 AM, Bucher Scott <SBuc...@schools.nyc.gov>
> wrote:
> > You could try the "PROC SQL _METHOD;" or "PROC SQL _TREE;" options
> > to get more details on how SQL is actually performing the joins. In
> > some cases, SQL actually performs a hash join. I believe there are
> > one or two conference papers on this topic and the aforementioned
> > options.
>
> Right - if the note in the SAS log after the SQL step shows that the
> method used was "sqlxjhsh" then SAS used a hash join.
>
> - Jay Soule
Oh yes, that's right! I hadn't thought of that and I should have remembered... OK, thanks a lot!
Sander