Dictionaries and hashing

A review of hashtables

Python dictionaries are hashtable implementations. In short, the instruction:

val = d[x]

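will, conceptually, compute hash(x), use it to select a bucket, and search that bucket for an entry whose key equals x; the value stored with that key is assigned to val.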
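The lookup steps can be sketched with a toy hashtable that keeps a list of (key, value) pairs per bucket. This is only an illustration under that assumption: the class name ToyTable, its methods and the bucket count are made up for this example, and Python's real dictionaries are organised differently internally.

```python
class ToyTable:
    """Toy chained hashtable; an illustration, not how Python stores dicts."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # the hash of the key decides which bucket to look in
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))

    def get(self, key):
        # 1. select the bucket from hash(key)
        # 2. scan that bucket for an equal key
        for k, v in self._bucket(key):
            if k == key:
                return v
        # 3. no equal key in the bucket
        raise KeyError(key)


t = ToyTable()
t.put("a", 1)
t.put("b", 2)
t.put("a", 3)
val = t.get("a")  # plays the role of val = d["a"]
```

The cost of get depends on how long the scanned bucket is, which is why the hash function should spread keys evenly over the buckets.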

Good dispersion functions:

Efficient dispersion functions:

Dictionaries

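We often use integers and strings as dictionary keys in Python. These types are immutable: once created, a value of such a type cannot be modified. As a consequence, the hash of such a key stays the same for as long as the program runs, so a lookup always reaches the bucket in which the pair was stored.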
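What immutability buys us can be checked directly with the built-in hash function. The snippet below is a small sketch; the variable names are chosen for this example only.

```python
d = {"key": 1, 42: 2}

# build an equal string at run time, as a separate object
k = "".join(["k", "e", "y"])

same_hash = hash(k) == hash("key")  # equal immutable values hash equally
found = d[k]                        # so the lookup reaches the stored pair
stable = hash(42) == hash(40 + 2)   # the same holds for integers
```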

If we use mutable objects (e.g. lists) as keys in Python, we get a type error: TypeError: unhashable type: 'list'. The following hypothetical example shows why mutable objects cannot be used as keys:

l = []
 
d = {}
d[l] = 1   # suppose this were allowed: the pair (l,1) would be stored in the bucket hash(l)
           # (in real Python, this line already raises the TypeError above)
 
l.append(0)
 
print(d[l]) # we would use hash(l) again to find the value assigned to l, but l has changed:
            # how could the hash function still compute the bucket in which l was stored?

Sets, as well as other built-in datatypes whose value can change (such as lists and dictionaries), cannot be used as keys in Python. If a set is needed as a key, use frozenset, which is the immutable alternative to sets in Python.
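A short sketch of the difference between the two: a set is rejected as a key, while a frozenset with the same elements is accepted. The exact wording of the error message may vary between Python versions, so the example only stores it.

```python
d = {}

try:
    d[{1, 2}] = "set"        # a set is mutable, hence unhashable
except TypeError as err:
    message = str(err)       # keep the error text for inspection

key = frozenset({1, 2})      # immutable, hence hashable
d[key] = "frozen"

# an equal frozenset, built separately, finds the same entry
value = d[frozenset([2, 1])]
```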

Writing your own hash function

So far, we have not pointed out what the hash function actually is. For predefined datatypes, Python implements efficient hash functions. However, we can also define hash functions for objects that we create:

class O:
    def __init__(self, x):
        self.x = x

    # __hash__ is the special method Python calls to hash objects of type O
    # when they are used as keys in a dictionary
    def __hash__(self):
        # not an efficient choice (every object gets the same hash),
        # but it illustrates the syntax for defining a hash function of your own
        return 0
 
d = {}
ob = O(45)
d[ob] = "hello"
d[O(1)] = "kitty"  # the pairs (O(45), "hello") and (O(1), "kitty") will share the same bucket (namely 0)
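A more useful hash function usually derives the hash from the fields that define equality. The sketch below assumes a hypothetical class Point (not part of the example above) and reuses the built-in hash of a tuple. It also defines __eq__, so that two equal points are treated as the same key. Note that a class which defines __eq__ without __hash__ becomes unhashable.

```python
class Point:  # hypothetical example class
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # reuse the built-in tuple hash over the fields that define equality
        return hash((self.x, self.y))


d = {}
d[Point(1, 2)] = "hello"
d[Point(3, 4)] = "kitty"

# a new but equal object hashes the same way and finds the stored value
looked_up = d[Point(1, 2)]
```

As with lists, the fields of such an object should not be modified while it serves as a key; otherwise its hash changes and the stored pair can no longer be found.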