Python extras

These extra sections cover slightly more advanced Python topics which will be useful for the project.

Python 3.12 introduces a more intuitive syntax for generics than the one presented in Programming introduction - Python. The new syntax is closer to generics in other languages such as Java or C++. The following code (in old syntax):

from typing import Generic, TypeVar

T = TypeVar('T')
U = TypeVar('U')
class Example(Generic[T]):
    def exampleMethod(self, param1: T, param2: U) -> U:
        ...
    ...

becomes, in Python 3.12:

class Example[T]:
    def exampleMethod[U](self, param1: T, param2: U) -> U:
        ...
    ...
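
As a rough usage sketch, the hypothetical Box class below (not part of the original material) shows a generic class being used. It is written in the older spelling so that it also runs on interpreters before 3.12; the comments give the equivalent Python 3.12 headers. Keep in mind that type parameters are only checked by external tools such as mypy, not at run time.

```python
from typing import Generic, TypeVar

T = TypeVar('T')
U = TypeVar('U')


class Box(Generic[T]):  # Python 3.12: class Box[T]:
    def __init__(self, value: T) -> None:
        self.value = value

    def get(self) -> T:
        return self.value

    # Python 3.12: def pair_with[U](self, other: U) -> tuple[T, U]:
    def pair_with(self, other: U) -> tuple[T, U]:
        return (self.value, other)


box: Box[int] = Box(3)
print(box.get())           # 3
print(box.pair_with('a'))  # (3, 'a')
```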

F-strings (formatted string literals) are a shorthand way of writing formatted strings in Python, similar in purpose to sprintf in C. The syntax is easy to read and write, without looking too different from regular strings (run the following code in a Python interpreter to see the results):

x = 5
y = 5.353
l = [1,2,3,4]
d = {1:'a', 2:'b'}
s = 'abc'
 
print(f'This is a simple f-string with no interpolated values')
print(f'To interpolate a value simply add it in curly brackets, as such: {x} <(when evaluated, this should be 5 instead of x)')
print(f'You can use as many variables, and even more complex types, or even expressions: {x}, {y}, {l}, {d}, {len(d)}')
print(f'For debugging purposes, we may often print things like "x=<value of x>". Using f-strings, this is also quite straightforward, just add an equal sign after the variable name: {x=}, {y=}, {l=}')
print(f'You can also give some more formatting options, such as: precision for floating values: {y:.2f}, width for strings: {s:5}, hex formatting for ints: {x:#0x}')
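
For reference, here is a condensed sketch of the same features with the resulting strings written out in comments. The variable n and the result names are chosen only for this illustration.

```python
x = 5
y = 5.353
s = 'abc'
n = 255

plain = f'{x} and {x + 1}'   # '5 and 6'        (expressions are evaluated)
debug = f'{x=}, {y=}'        # 'x=5, y=5.353'   (name and value, for debugging)
rounded = f'{y:.2f}'         # '5.35'           (two digits after the decimal point)
padded = f'[{s:5}]'          # '[abc  ]'        (padded to width 5, strings are left-aligned)
as_hex = f'{n:#x}'           # '0xff'           (hexadecimal with the 0x prefix)

print(plain, debug, rounded, padded, as_hex, sep='\n')
```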

Official documentation: https://docs.python.org/3/reference/lexical_analysis.html#f-strings

Dataclasses are a shorthand way of implementing certain methods and generally removing some boilerplate code.

Assume you're writing a class to implement linked lists, which can be printed in a similar way to ADTs (so the list [1,2,3] should be printed as Cons(1,Cons(2,Cons(3,Empty())))). Your code may look like this:

class List:
    pass
 
class Empty(List):
    def __str__(self):
        return 'Empty()'
 
    def __repr__(self):
        return str(self)
 
class Cons(List):
    def __init__(self, head, tail):
        self.head = head
        self.tail = tail
 
    def __str__(self):
        return f'Cons({self.head},{self.tail})'
 
    def __repr__(self):
        return str(self)

Contrast this with how a similar level of functionality would be implemented in Haskell:

data List a = Cons a (List a) | Empty deriving Show

Is it possible to eliminate most of the boilerplate code present in the Python version while keeping most of the functionality? Yes, using dataclasses.

dataclass is a decorator. In general, decorators are placed right before a class or function definition (prefixed with the @ symbol) and alter its behaviour in some way. In our case, the dataclass decorator generates some generic methods for the class, most importantly __init__, __repr__ and __eq__ (__str__ is not generated, but the default __str__ falls back to __repr__, so printing works as well). If we want our class to contain member variables, we only need to list them, with type annotations, in the class body. The list code rewritten with dataclasses looks like this:

from dataclasses import dataclass
from typing import Any

class List:
    pass

@dataclass
class Empty(List):
    pass  # a repr method is generated automatically (str falls back to it)

@dataclass
class Cons(List):
    head: Any  # typing.Any; the builtin any is a function, not a type
    tail: List
Now, the downside is that we lose some control over the specific implementations of the autogenerated methods. For example, the generated __repr__ (and therefore the output of str and print) also includes the name of each attribute (for example, we get Cons(head=1, tail=Empty()) instead of Cons(1,Empty())).
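
The following short sketch repeats the dataclass definitions so that it can be run on its own, and exercises the generated methods. Besides __init__ and __repr__, the decorator also generates __eq__, which compares instances field by field.

```python
from dataclasses import dataclass
from typing import Any


class List:
    pass


@dataclass
class Empty(List):
    pass


@dataclass
class Cons(List):
    head: Any
    tail: List


xs = Cons(1, Cons(2, Empty()))          # generated __init__
print(xs)                               # Cons(head=1, tail=Cons(head=2, tail=Empty()))
print(xs == Cons(1, Cons(2, Empty())))  # True: generated __eq__ compares the fields
print(xs.tail.head)                     # 2
```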

There are options to disable certain methods from being generated or add extra ones, by giving arguments to the dataclass decorator. For this, and further reading on this topic, please refer to the official documentation for dataclasses.
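
One possible way to recover the compact output, sketched under the assumption that only printing needs to change, is to write __str__ by hand: dataclass never generates __str__, so the hand-written version is kept while __repr__ is still generated. The frozen=True argument is included as an example of a decorator option; it forbids assigning to fields and, together with the default eq=True, makes instances hashable.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Any


class List:
    pass


@dataclass(frozen=True)
class Empty(List):
    def __str__(self):
        return 'Empty()'


@dataclass(frozen=True)
class Cons(List):
    head: Any
    tail: List

    def __str__(self):
        return f'Cons({self.head},{self.tail})'


xs = Cons(1, Cons(2, Empty()))
print(str(xs))                               # Cons(1,Cons(2,Empty()))
print(len({xs, Cons(1, Cons(2, Empty()))}))  # 1: equal instances have equal hashes

try:
    xs.head = 10
except FrozenInstanceError:
    print('fields of a frozen dataclass cannot be reassigned')
```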

As part of your project, it is very likely that you will attempt to use a set or list either as an element of another set, or as part of the key of a dictionary. However, if you actually try this, you will get an error stating that your element is not hashable. Why is this?

In short, both dictionaries and sets are implemented as hash tables in Python (for quick lookup and insertion of unique elements). This means that elements must implement a hash method, __hash__ (alongside an equality method, __eq__), and, moreover, this hash function must always be consistent with the equality comparison: objects which are equal must always have the same hash, while the converse is not required.

Additionally, the hash is computed when the element is added to the collection and the stored value is not updated afterwards, so it must stay constant during the object's lifetime. As such, any modification to the object which affects its equality with other objects should also be forbidden.
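
A minimal sketch of a class that respects these rules is given below. Point is a hypothetical class invented for this illustration, and its instances are treated as immutable by convention only (nothing stops a caller from reassigning x or y).

```python
class Point:
    """Hypothetical value class; do not modify x or y after construction."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares, so equal points hash equally.
        return hash((self.x, self.y))


seen = {Point(1, 2)}
print(Point(1, 2) in seen)  # True: a different object, but equal and with the same hash
print(Point(2, 1) in seen)  # False
```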

All of this essentially boils down to: the built-in mutable collections (lists, sets and dictionaries) can NOT be used as set elements or dictionary keys. However, using a collection in this way may still be desirable (for example, in the subset construction algorithm).
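
To see the restriction in practice, the sketch below probes a few built-in values. The helper is_hashable is a hypothetical name introduced here; the exact wording of the error message depends on the Python version, so it is printed rather than quoted.

```python
def is_hashable(value):
    """Return True if hash(value) succeeds (hypothetical helper for illustration)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


for value in ([1, 2], {1, 2}, {1: 'a'}, (1, 2), frozenset({1, 2}), 'abc'):
    # list, set and dict report False; tuple, frozenset and str report True
    print(type(value).__name__, is_hashable(value))

try:
    bad = {[1, 2]}  # a list as a set element
except TypeError as err:
    print('TypeError:', err)
```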

To work around this restriction we can use IMMUTABLE COLLECTIONS (collections which can no longer have elements added, removed or replaced): tuples (the immutable counterpart of lists; their elements may or may not be hashable, and a tuple is hashable only if all of its elements are) and frozensets (immutable sets; their elements must be hashable, as with regular sets, so frozensets are always hashable).

Converting from a mutable to immutable collection is quite easy (you only apply the constructor over the original collection):

l = [1,2,3]  # list
t = (1,2,3)  # tuple
s = {1,2,3}  # set
fs = frozenset({1,2,3})  # frozenset
 
tuple(l)       # -> the tuple (1, 2, 3)
list(t)        # -> the list [1, 2, 3]
set(l)         # -> the set {1, 2, 3}
set(fs)        # -> the set {1, 2, 3}
frozenset(s)   # -> frozenset({1, 2, 3})
frozenset(l)   # -> frozenset({1, 2, 3})
 
# etc.
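
As an illustration of why this matters, the sketch below uses frozensets as parts of dictionary keys and as set elements, roughly the way a subset construction might store its states. The names transitions, start, target and visited are made up for this example.

```python
# Hypothetical transition table whose states are sets of smaller states.
transitions = {}
start = frozenset({0, 1})
target = frozenset({1, 2})
transitions[(start, 'a')] = target  # a tuple of hashable parts is a valid key

# Lookup works with any equal frozenset; element order is irrelevant.
print(transitions[(frozenset({1, 0}), 'a')] == target)  # True

visited = {start, target}            # a set of frozensets
print(frozenset([2, 1]) in visited)  # True

try:
    hash((1, [2, 3]))
except TypeError:
    print('a tuple containing a list is not hashable')
```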

Match statements (available since Python 3.10) are a way of performing structural matches (similar to pattern matching in functional languages such as Haskell or Scala). They can distinguish the type of an object and match its attributes against given values or bind them to names. Match statements can also be used on lists.

The syntax is as follows:

match <expression to match>:
   case <Class to match to>(): <code to execute on this case (the parentheses are required; a bare name would capture any value instead of checking the class)>
   case <Class to match to>(<members to check>=<value> [...] (optional)): <code to execute on this case>
   case <Class to match to>(<members> = <label> [...] (optional)): <code to execute on this case (label will be available as a variable)>

   # for lists
   case []: <code to execute on empty list>
   case [x,y]: <code to execute on list with exactly 2 elements, which will be labeled x and y>
   case [x,*xs]: <code to execute on a list with at least one element, the first of which will be labeled x, and a list containing the rest of the elements will be labeled xs>

The case statements will be evaluated in order, stopping when the first match occurs.

The following example covers most features of the match statement:

from dataclasses import dataclass
 
 
@dataclass
class A:
    x: int
    y: int = 5
 
 
class B:
    def __init__(self, x):
        self.x = x
        self.z = 5
 
    def set_z(self, z):
        self.z = z
 
 
class C(B):
    def __init__(self, x, y, z):
        super().__init__(x)
        self.y = y
        self.z = z
 
 
def match_example(obj):
    match obj:
        case []:
            return 'empty list'
        case [x, y, *xs]:
            return f'list starting with {x}, then {y}, with the rest being {xs}'
        case [x, *_]:
            return f'list starting with {x}'
        case A(y=name_for_y):
            return f'An A with x = {obj.x}, y = {name_for_y}'
        case B(z=5):
            return f'Default B with x = {obj.x}'
        case B(z=modified_z, x=5):
            return f'B with x = 5 and z modified to {modified_z}'
        case C():
            return f'C with y = {obj.y}'
        case B(z=modified_z):
            return 'this case will not be reached in examples'
 
 
if __name__ == '__main__':
    objects = [
        [], # will print 'empty list'
        [1, 2, 3], # will print 'list starting with 1, then 2, with the rest being [3]'
        [1], # will print 'list starting with 1'
        A(4), # will print 'An A with x = 4, y = 5'
        B(5), # will print 'Default B with x = 5'
        C(5, 6, 7), # will print 'B with x = 5 and z modified to 7'  (the 'B with modified z' case is matched before reaching the case with C)
        C(6, 6, 7) # will print 'C with y = 6'
    ]
    for o in objects:
        print(match_example(o))