====== Code Quality check ======

The project will be evaluated for code quality using a set of automated tools, but your TA may perform additional inspection during the presentation. The code quality features we test fall into two categories: error-prone practices and improvement suggestions.

===== Error-prone practices =====

This category covers code that is very vulnerable to errors, even if, in the case of your submission, the code behaves as expected. The presence of too many of these will lead to your submission being penalised by up to 0.1 points per stage. (You will still be able to obtain the remaining 0.9 points on functionality alone.) The following types of errors WILL be flagged in this category:

==== General errors ====

These are bits of code which will lead to unexpected behaviour or crashes but, due to the interpreted nature of the Python language, may not be detected when running the code normally (as that piece of code may not be executed). This includes:

<code python>
x = a + 1  # a is not yet defined
a = x + 1

class A:
    def __init__(self, rarely_used_parameter=False):
        if rarely_used_parameter:
            # the member 'm' is not yet initialised, but if the testing happens to never set
            # 'rarely_used_parameter' to True, this error may go undetected
            self.m += 1
        self.m = 0

    def inc(self):
        self.m += self.n  # A does not have a member named 'n'

    def init_x(self):
        self.x = 5  # any member variables should be defined in the __init__ method

    def use_x(self):
        self.x += 1  # this will cause an error if init_x is not called beforehand
</code>

This also includes trying to import nonexistent modules, or importing objects which do not exist in the given module. Another issue which, depending on the circumstances, may cause problems is cyclic imports (two or more modules import each other in a cyclic manner, meaning that a given module ends up depending on itself). The first type of error may go undetected if imports are placed inside functions, classes or control blocks.
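As a minimal sketch of this last point (the module name ''module_that_does_not_exist'' and both function names are hypothetical, chosen only for illustration), the snippet below places a broken import inside a function body. Defining the function succeeds, and the error only surfaces once the function is actually called:

```python
def rarely_called():
    # The import statement only runs when the function body runs,
    # so a broken import here stays invisible until the first call.
    import module_that_does_not_exist  # hypothetical, nonexistent module
    return module_that_does_not_exist.answer()


def import_fails_on_call():
    """Return True if calling rarely_called() raises ImportError."""
    try:
        rarely_called()
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return True
    return False


if __name__ == "__main__":
    # The faulty import is only detected at this point, not at definition time.
    print(import_fails_on_call())
```

With the same import at module level, loading the file would already raise the error.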
To avoid this, make sure all of your imports are placed at the beginning of your file, outside of any control blocks.

It is an error to define a function with the same name multiple times in a single module (the exception being inner functions, i.e. functions defined inside other functions) and, likewise, to define multiple methods with the same name inside the same class (overriding an inherited method is exempt from this).

Another possible error is defining a member of a class with the same name as one of its methods. This will make the method uncallable (any reference to that name will point to the member instead of the method).

==== Import and variable redefinitions (shadowing) ====

This covers cases where an already-defined object is redefined in an erroneous way. If creating a new object is the desired behaviour, consider renaming it to avoid conflicts and confusion.

Defining a variable in an inner scope (inside a function or method) makes any outer variable (including imports) with the same name temporarily inaccessible. This is usually done by accident.

<code python>
import re

matches = set()

def split_digits(input_string):
    # the 're' variable defined here 'shadows' (hides) the name of the module re;
    # any reference to the name 're' within this function refers to this string
    re = '(0|1|2|3|4|5|6|7|8|9)*'
    # this calls the string split method rather than the split function of the re module,
    # which is not the intended behaviour (here an error occurs because str.split expects
    # its second argument to be an integer)
    split_str = re.split(re, input_string)
    # this does NOT update the global variable: without a 'global' declaration the assignment
    # makes 'matches' local to the function, so reading it here raises UnboundLocalError
    matches = matches.union(split_str)
    return split_str

def process_matches(matches):  # parameter names can also shadow outer variables or imports
    for m in matches:
        m = get_some_value()  # this redefines the loop variable, which may or may not be the desired outcome
        ...
</code>

==== Excessive, unused or misplaced imports ====

Imports should be placed at module level, at the beginning of the file (not inside function or class definitions or within control blocks), and should ideally import either the module as an object (optionally renamed) or only the required objects from that module.

<code python>
# star imports are generally to be avoided; only import what is needed,
# for example 'from time import perf_counter'
from time import *
from typing import Generic  # unused import, consider deleting this
# if you use many different things from a module, listing all imports may not be feasible;
# in this case, consider importing the module as an object rather than individual names
# from it (qualified import): 'import numpy' or 'import numpy as np'
from numpy import *
import pandas as pandas  # useless renaming using 'as'

def time_something(f, x):
    start = perf_counter()
    ...  # do some pandas/numpy processing on x, generating x_processed
    f(x_processed)
    end = perf_counter()
    return end - start
</code>

==== Global variables ====

While global variables may not cause issues in isolated cases, when producing code which is meant to be used by other codebases, as is the case for our project, the use of global variables may cause issues in multithreaded or multiprocess applications.
As such, the use of such variables is strongly discouraged. The exceptions are global constant definitions, type variables and type aliases.

<code python>
from typing import Generic, TypeVar

PI = 3.1416  # constants are a good use of global variables
T = TypeVar('T')  # TypeVars will almost always need to be global

class GenericClass(Generic[T]):
    pass

global_stack = []

def exported_function(param):
    global global_stack
    global_stack = []
    fill_global_stack_with_possible_solutions(param)
    for sol in global_stack:
        check_solution(sol, param)
    # if exported_function is called by 2 or more concurrent threads, they may interfere with
    # each other, and there is a strong possibility that it will be called like that
</code>

==== Redundant and trivially simplifiable code ====

This includes:
  * unreachable code
  * expressions with no effect
  * unused variables
  * redundant code and expressions written in an overcomplicated manner (using extra unneeded statements or harder-to-read versions of statements, such as dunders instead of builtins)
  * use of improper iteration methods (for example, iterating over the range of indices and then accessing a list directly in the loop, instead of iterating over the list)

<code python>
def f():
    x = 1  # this value is never used
    1+2  # this statement has no effect
    y = 1
    y = y  # this statement is redundant
    return True
    y += 1  # this code is unreachable

def g(z):  # z is not used
    # two errors: constant comparison (the condition always evaluates to True, so it can be
    # safely replaced with True), and constant if condition (the 'then' branch is always
    # executed, so the if can be removed)
    if 1 < 2:
        # the following if can be simplified to 'return f()', or to 'return bool(f())' if the
        # result is not guaranteed to be a bool but is interpreted as such
        if f():
            return True
        else:
            return False

h = lambda x: x + 1  # use a more easily readable function def
# do not define a lambda just to immediately call it; use the expression 4 * 2 directly
# (or, in this case, just write 8)
t = (lambda x: x * 2)(4)

l = [1,2,3]
if not not 1 in l:  # double not is redundant
    print(not 1 < 2)  # replace not less than with >=
s = {x for x in l}  # do not use a comprehension just to copy elements, use the built-in set() constructor: set(l)

for i in range(len(l)):
    # do not iterate over range and then index the list, just iterate over the list
    # (use enumerate if you also need the index)
    print(l[i])

def returns_nothing():
    print("imagine this changed some state")
    return  # this return is not needed

while True:
    if some_condition():  # use 'while not some_condition():' instead
        break
    print("do something")

def my_min(ls):  # do not do this, Python already has min, max, any, all, sum implemented (along with many others)
    m = ls[0]
    for x in ls[1:]:
        if x < m:
            m = x
    return m

class A:
    pass

a = A()
if a.__class__ == A:  # use isinstance or match statements
    print(a.__str__())  # use str(a)
</code>

==== Complexity ====

Your functions and methods should be modular and their control flow should be easy to understand. We will look at the following metrics: statements per function, and number of branches and loops (cyclomatic complexity). In addition, duplicate or very similar sequences of code should be refactored into functions or methods where possible.

==== Return consistency ====

Either all return statements of a function should return an expression, or none of them should.

<code python>
def f():
    if some_condition():
        return True
    elif some_other_condition():
        return  # because the other return yields a value (True), this should explicitly return a value as well
    # if both conditions are false, an implicit bare return exists here,
    # which should be replaced by a return with a value
</code>

==== Dangerous default values ====

You should be careful when using default values for function/method parameters.
Mutable objects (such as list/set/dict literals) are created once, when the function is defined, and are reused between calls, which may lead to unexpected behaviour:

<code python>
def f(l = []):
    l.append(1)
    print(l)

f()  # prints [1]
f()  # prints [1, 1]
f()  # prints [1, 1, 1]
</code>

Use ''None'' as the default and create the object inside the function instead:

<code python>
def f(l = None):
    if l is None:
        l = []
    l.append(1)
    print(l)

f()  # prints [1]
f()  # prints [1]
f()  # prints [1]
</code>

===== Improvement suggestions =====

These are cases in which your code is readable and without major performance issues, but could be improved by using more specific or clearer syntax. Having very few of these suggestions left unimplemented will grant you a bonus of up to 0.1 points per stage (meaning you will be able to reach 1.1 points on each stage). This bonus is only granted if no penalty is given for error-prone practices.

==== Comprehensions ====

You can use comprehensions, such as the ones described [[lfa:2023:lab01#list_comprehensions|here]], instead of for loops to more succinctly perform operations that build lists, sets or dictionaries by iterating over other objects, especially maps, filters, cartesian products and flattening.
<code python>
L = [1,2,3,4]
S = { ('a', 'b'), ('c', 'd') }  # set elements must be hashable, so tuples are used instead of lists
D = {'a':1, 'b': 2, 'c':3}

# maps
double_l = []
for x in L:
    double_l.append(x * 2)

reverse_d = {}
for k,v in D.items():
    reverse_d[v] = k

# filter
even_l = []
for x in L:
    if x % 2 == 0:
        even_l.append(x)

# cartesian product
all_pairs = []
for x in L:
    for y in S:
        all_pairs.append((x,y))

# inner object iteration (flattening)
flat_s = set()
for l in S:
    for x in l:
        flat_s.add(x)

# combined operations
comb = []
for x in L:
    if x % 2 == 0:
        for l in S:
            for y in l:
                comb.append(y * x)
</code>

The loops above can be rewritten as:

<code python>
L = [1,2,3,4]
S = { ('a', 'b'), ('c', 'd') }
D = {'a':1, 'b': 2, 'c':3}

double_l = [x * 2 for x in L]
reverse_d = {v:k for k,v in D.items()}
even_l = [ x for x in L if x % 2 == 0 ]
all_pairs = [(x, y) for x in L for y in S ]
flat_s = { x for l in S for x in l }
comb = [ x * y for x in L if x % 2 == 0 for l in S for y in l ]
</code>

==== Swapping variables ====

In Python you can use ''a, b = b, a'' to swap the values of two variables a and b instead of using temporary variables.

==== Type hints ====

Type hints can help others (including yourselves, if you come back to the code after a short break) better understand your code, and allow software tools to give you meaningful feedback (autocomplete/IntelliSense, live type errors and suggestions). It is not necessary to annotate every single variable you define, but it is recommended to use type hints on functions and methods (both parameters and return types) and on class members, and to use generics whenever applicable.

===== Example code =====

Here are some small examples of possible code submissions for a very simple task: one which is penalised, one which receives regular points and one which also receives a bonus.

TODO: decide task and give example code

==== Penalised code ====