Date: Tue, 20 Oct 2009 13:01:20 -0700
Reply-To: "Richard A. DeVenezia" <rdevenezia@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <rdevenezia@GMAIL.COM>
Organization: http://groups.google.com
Subject: Re: can has hash manys?
Content-Type: text/plain; charset=ISO-8859-1
On Oct 18, 9:39 pm, z <gzuck...@snail-mail.net> wrote:
> I'm new to hash functions, so maybe I'm missing something.
> I have a big matching: a table of 90 million transactions and a table
> of 1,000 members. Each member in the little table has at least one
> transaction; likely, more than one. The task, of course, is to pull
> out all transactions related to the thousand members, matching on
> member_id. I'm testing it just pulling the transaction date. Simple
> enough, but the trouble is that it runs over 24 hours on our system
> without reaching an endpoint, even using an index on the big table.
Do you really want a many-to-many match ?
Suppose some member X has
- in LITTLE 10 different dates over 40 rows.
- in TRANSACT 100 rows
Should member X in the resultant join have 1,000 rows (the 100 rows of
transact matched with each of the 10 dates) ?
You are getting only one date per member id because hash dataset:
option tacitly ignores repeated key entries found in the source
table. You will only get a hash entry for each first distinct key
row.
--
Richard A. DeVenezia
http://www.devenezia.com
|