Python dictionaries are hash table implementations. In short, the instruction:
val = d[x]
will:
1. apply the hash function to x, to obtain a bucket $b$ ($hash(x) = b$);
2. search for x in bucket $b$, and return its corresponding value.
Good dispersion functions:
Efficient dispersion functions:
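To make the two lookup steps behind val = d[x] concrete, here is a minimal sketch of a bucket-based table. The class ToyHashTable, its fixed number of buckets and the use of hash(key) % n to pick a bucket are assumptions made for illustration only. CPython's real dict is laid out differently (it uses open addressing), so this is a toy model of the idea and not the actual implementation.

```python
class ToyHashTable:
    """Hypothetical toy table: a fixed list of buckets, each a list of (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Step 1: hash the key to choose a bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite its value
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        # Step 2: search for the key inside that bucket only.
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)


t = ToyHashTable()
t["a"] = 1
t[42] = "answer"
t["a"] = 2       # overwrites the previous value stored for "a"
val = t["a"]
```

With a hash function that spreads keys evenly, each bucket stays short, so the search in step 2 touches only a few pairs.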
We often use integers and strings as dictionary keys in Python. These types are immutable, which means that a value of such a type cannot be modified after it is created. As a consequence:
If we use mutable objects (e.g. lists) as keys in Python, we get a type error: TypeError: unhashable type: 'list'. The following example shows why mutable objects cannot be used as keys:
l = []
d = {}
d[l] = 1      # suppose the pair (l, 1) were stored in the bucket hash(l)
              # (in practice, this line already raises the TypeError above)
l.append(0)
print(d[l])   # we would use hash(l) to obtain the value assigned to l, but l has changed:
              # how could the hash function still compute the bucket in which l was stored?
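As a small runnable complement to the example above, the sketch below catches the error instead of letting the program stop, and then stores the same data under an immutable key. Converting the list to a tuple is a suggestion that does not come from the text above; it is one common workaround, not the only one.

```python
d = {}
key = [1, 2]

try:
    d[key] = "value"          # lists are mutable, so they are not hashable
except TypeError as err:
    message = str(err)        # the message mentions "unhashable type"

# Assumed workaround: use an immutable snapshot of the list as the key.
d[tuple(key)] = "value"

key.append(3)                 # changing the list does not affect the stored key
still_there = (1, 2) in d
```

The tuple (1, 2) cannot change after it is created, so its hash always leads to the same bucket.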
Sets, as well as other datatypes whose value can change, cannot be used as keys in Python. If necessary, one option is to use frozenset, which is the immutable alternative to sets in Python.
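A short sketch of the difference between set and frozenset as dictionary keys follows. The variable names are illustrative only.

```python
s = {1, 2, 3}
d = {}

try:
    d[s] = "mutable set"      # a regular set is mutable, so it is rejected as a key
    set_allowed = True
except TypeError:
    set_allowed = False

fs = frozenset(s)             # immutable copy of the set
d[fs] = "immutable set"

# Equal frozensets have equal hashes, whatever order the elements were given in.
lookup = d[frozenset([3, 2, 1])]
```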
So far, we have not pointed out what the hash function actually is. For built-in datatypes, Python implements efficient hash functions. However, we can also define hash functions for objects that we create:
class O:
    def __init__(self, x):
        self.x = x

    # this special method is the hash function used for objects of type O
    # when they serve as keys in a dictionary (it replaces the default one inherited from object)
    def __hash__(self):
        # this implementation is not really efficient; it simply illustrates
        # the syntax for defining a hash function of your own
        return 0

d = {}
ob = O(45)
d[ob] = "hello"
d[O(1)] = "kitty"
# the pairs (O(45), "hello") and (O(1), "kitty") will share the same bucket (namely 0)
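The class O above returns a constant, so every object lands in the same bucket. As a hedged alternative, the sketch below derives the hash from the stored value and also defines __eq__, so that two objects holding the same value are treated as the same key. The class name P is hypothetical, and pairing __hash__ with __eq__ is an addition not shown in the original example. Without __eq__, Python compares objects by identity, and looking up a freshly built object would fail.

```python
class P:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        # two P objects are equal when they hold equal values
        return isinstance(other, P) and self.x == other.x

    def __hash__(self):
        # equal objects must return equal hashes, so reuse the hash of x
        return hash(self.x)


d = {}
d[P(45)] = "hello"
d[P(1)] = "kitty"

found = d[P(45)]   # a new, equal object finds the existing entry
```

This assumes x is never modified while the object is used as a key. Changing it would bring back the problem described for lists.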