Date: Tue, 9 Apr 2002 14:28:42 -0700
Reply-To: Cassell.David@EPAMAIL.EPA.GOV
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject: Re: Key-indexing, hash, and MORE (was: a interesting question).
Content-type: text/plain; charset=us-ascii
"Dorfman, Paul" <Paul.Dorfman@BCBSFL.COM> added [in part]:
> Combining a line of information from a good birdie with the results I
have
> seen so far leads me to believe that AA operates in the following
manner:
>
> 1) AA declaration establishes table structure taking into account the
> combined key/data length and number of buckets (internal, depends on
HASHEXP
> specification)
Good. Being able to provide some hint to the compiler should help
when you know you want to set up a massive hash.
> 2) Insertion:
> Hash the key with the result pointing to a bucket organised as an AVL
tree
> Ascend the bucket to memory
> Search the tree
> If found, set RC>0 for AA.ADD and terminate
> Insert the key
> Rebalance the tree if needed
> Descend the bucket
Yep. A hash is fast to search, but is typically O(N) to construct.
On the other hand, an array is fast to build [ O(1) when you simply
put the i_th value in array slot i or anything analogous], but
comparatively slow to search [ O(N) for a linear search, since on
average you have to look at half the values to find what you want].
> > or does it expand the associative array as you add elements?
>
> I think that the number of buckets is fixed by HASHEXP, but they are
allowed
> to overflow (equivalent of having the load factor > 1).
Well, that shouldn't be a problem. Still, a good algorithm should
expand the hash beyond a certain point, to minimize searching time.
> To what they already know, I should add that I do not like the formal
term
> 'associative array'. Just
> like in Perl, folks will call it simply hash.
Amen, brother! Okay, that's two of us for 'hash'.. and all SAS
programmers for 'associative array'... Do we have them outnumbered
yet? :-)
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician