20 September 2021

This will kick your tail.

Hashing

A hash function is a function whose domain is some type of object and whose codomain is the integers. Within the lifetime of a given program, a hash function is a mathematical function. Generally, hash functions are defined in terms of object state. As a result, only immutible objects are hashable.

Hashability Hashing is a technique that allows for the rapid retrieval of data from sets and dictionaries. Hashing breaks a set into "buckets"; deterining the bucket to look in involves calling a hash function on an object. This is a quick operation. These buckets are maintained by an object called a hash table.

When the hash table becomes too crowded, it is enlarged and all objects are rehashed and placed into the newer larger table.

Here is a good way to think of a hash table. It is a giant array of memory, say of size M. When an object is to be inserted in this table, we mod by M to get result loc and place that object in a list that lives at entry loc. When the table crowds, the lists get too long and efficiency is lost. At that time, rehashing occurs.

Mutable objects are not hashable because hash functions depend on object state and the must be mathematical functions, and therefore be consistent.

Sets and Dictionaries

Today you will meet two new data structures, both of which are iterables. A set is an unordered data structure containing no duplicates. A dictionary is an associative array that maintains paired objects. Both are hashed data structures, in that they only admit the insertion of hashable objects.