HashTable

Stores data in an Associative Manner

It is a Key Value Store

  • Key is Unique
  • Key is mapped to an Index: Hashing is used to convert a range of key values into a range of indexes of an array

Hashing

Derives a fixed size Result from Input

Examples of Hash Functions

  • Cryptographic hash function
    • MD5 "MD" stands for "Message Digest"
    • SHA-2 Secure Hash Algorithm 2
  • Zobrist hashing used in computer programs that play abstract board games, such as chess and Go

Collision Resolution

Find some other place to put the object that is being inserted into the hash table because the actual location was already taken

Folling are few Collision Resolution Stratagies

Linear probing

  1. When a new item is inserted into the hash table, use the hash function to determine where in the table it belongs.
  2. Check to see if an element already exists in that spot in the table. If the spot is empty, place the element there and return, otherwise go to step 3.
  3. If the location the hash function pointed to was location i, simply check location i + 1 to see if that is available. If it is also taken, check i + 2, and so on, until an open spot is found.

Quardratic probing

Rehasing OR Double hashing

Used by HashTable

There is a set of hash different functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed.

These hash functions only differentiating by a multiplicative factor

In general, the hash function Hk is defined as:

H^k(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize – 1)))] % hashsize

Chaining

Used by .Net Dictionary

A secondary data structure(Linked List) is utilized to hold any collisions.

Each slot in the Dictionary has an array of elements that map to that bucket. In the event of a collision, the colliding element is prepended to the bucket's list.

Load Factors and Expanding the Hashtable

Also sometemes reffered to as Fill Factor

It specifies the maximum ratio of items in the Hashtable to the total number of slots.

Value Range: 0.1 and 1.0

Whatever value you provide, it is scaled down 72%

So even if you pass in a value of 1.0 the Hashtable class's actual loadFactor will be 0.72. The Microsoft team found through empirical testing that values greater than 0.72 seriously degraded the performance. They decided that the developer using the Hashtable would be better off if they didn't have to remember a seeming arbitrary value in 0.72, but instead just had to remember that a value of 1.0 gave the best results. So this decision, essentially, sacrifices functionality a bit, but makes the data structure easier to use and will cause fewer headaches in the developer community.

When a new item is added to the HashTable

  1. Check if ratio of Items to Slot is Greater then Load Factor
  2. If Yes: Expand the HashTable