An introduction of Hash function
Introduction
hash functions
for the collision-resolution process
-
collision: two keys may hash to the same slot
-
hashing with separate chaining (link structure)
-
linked list of the key-value pairs whose keys hash to that index
-
the basic idea is to choose m to be sufficiently large that the lists are sufficiently short to enable
efficient search through a two-step process
-
hashing with linear probing
-
open-addressing: store N key-value pairs in a hash table of size m>N
-
the idea is that rather than using memory space for references in linked lists, we use it for
the empty entries in the hash table, which mark the ends of probe sequences
hashtables vs other dictionary implementations
-
choosing a good capacity m
- We can choose the table size m to be sufficiently small that we do not waste a huge area of
contiguous memory with empty chains but sufficiently large that we donot waste time searching through
long chains.
-
hashtables are sometimes good because
- the keys do not have to come from an ordered type
- in practice we can set things up for constant-time performance
-
hashtables aren't always the best choice because
- you cannot easily read things out in a sorted order, assuming these was such an order
- you can't really get a good hash function that be can computed quickly, so they're usally
bad for small dictionaries
- resizing is very expensive, and probably can't be used in most real-time systems
reference
- Juraj Hromkovic - ch3 "Design and Analysis of Randomized Algorithms"
- Robert Sedgewick and Kevin Wayne - ch3.4, "Algorithms"
- hashtable
- hash function