====== Dictionaries and hashing ======

===== A review of hashtables =====

Python dictionaries are [[https://en.wikipedia.org/wiki/Hash_table|hashtable]] implementations. In short, the instruction:

<code python>
val = d[x]
</code>

will:
  * apply a function $math[hash] to the object ''x'' to obtain a //bucket// $math[b] ($math[hash(x)=b]);
  * search (in linear time) for the key ''x'' in bucket $math[b] and return its corresponding value. Buckets are just collections of key-value pairs (often implemented as arrays).

Good dispersion functions:
  * If the function $math[hash] is a **good** dispersion function, then its range (co-domain) will be large, which means that we have a lot of buckets with very few pairs inside them.
  * A very **bad** dispersion function is a constant function, which performs exactly like an array (a single bucket with all pairs inside it).
  * The **best possible** dispersion function would have one bucket per pair.

Efficient dispersion functions:
  * An efficient dispersion function is easily computable (finding the proper bucket value is fast).

===== Dictionaries =====

We often use integers and strings as **keys** when relying on dictionaries in Python. These types are **immutable**, which means their values **do not change during the execution of a program**. As a consequence:
  * **we can use a hash function to always get the same bucket value**.

If we use **mutable** objects (e.g. lists) as keys in Python, we get a type error: ''TypeError: unhashable type: 'list'''. The following example shows why mutable objects cannot be used as keys:

<code python>
l = []
d = {}
try:
    d[l] = 1   # if allowed, the pair (l, 1) would be stored in the bucket hash(l)
except TypeError as e:
    print(e)   # reports that a list is unhashable

l.append(0)    # l has changed, so a hash computed from its contents would change too
# A later d[l] would use hash(l) to find the value assigned to l, but l was changed:
# the hash function could no longer be relied on to compute the bucket in which l was stored.
</code>
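
The lost-key problem can be made concrete with a toy table. The sketch below is only an illustration and not how Python's ''dict'' is actually implemented: ''ToyTable'' and ''ListKey'' are hypothetical names introduced here, the number of buckets is fixed at 8, and the hash of ''ListKey'' is deliberately simplistic (the number of items) so that the bucket indices are easy to follow.

```python
class ToyTable:
    """A toy hashtable: a fixed list of buckets, each a list of (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # hash(key) selects the bucket; the modulo keeps the index in range
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite its value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        # linear search inside the selected bucket
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)


class ListKey:
    """Hypothetical mutable key whose hash depends on its current contents."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, ListKey) and self.items == other.items

    def __hash__(self):
        return len(self.items)        # assumption: simplistic hash, for illustration only


table = ToyTable()
key = ListKey([])
table.put(key, 1)                     # hash(key) is 0, so the pair goes into bucket 0
found_before = table.get(key)         # the lookup searches bucket 0 and finds the pair

key.items.append(0)                   # the key changes, and hash(key) becomes 1
try:
    table.get(key)                    # the lookup now searches bucket 1, which is empty
    found_after = True
except KeyError:
    found_after = False

print(found_before, found_after, len(table.buckets[0]))
```

After the ''append'', the pair is still sitting in bucket 0, but the lookup searches bucket 1, so the value can no longer be reached through its own key. Python avoids this situation by refusing to hash lists in the first place.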