This is an old revision of the document!
The project will be evaluated for code quality using a set of automated tools, but additional inspection may be done during the presentation by your TA.
The code quality features we will be testing can be divided in two categories: bad practices and improvement suggestions.
This category encapsulates code which is very vulnerable to errors, even if, in the case of your submission, the code behaves as expected.
The presence of too may of these will lead to your submission being penalised with up to 0.1 points per stage.
(You will still be able to obtain the rest of 0.9 point on functionality alone).
The following are the types of errors which WILL be flagged from this category:
Using a variable before its definition
This covers all cases of accessing a variable before a value is assigned to it for the first time. Due to the lack of a compile-time type check, such errors may slip in undetected in the code.
- example.py
x = a + 1 # a is not yet defined
a = x + 1
class A:
def __init__(self, rarely_used_parameter=False):
if rarely_used_parameter:
self.m += 1 # the member 'm' is not yet initialised, but if the testing happens to never set the 'rarely_used_parameter' to True, this error may go undetected
self.m = 0
def inc(self):
self.m += self.n # A does not have a member named 'n'.
Import, variable, member or method redefinitions and shadow-ing
This covers cases where an already-defined object is redefined in an erronous way:
import and variable shadowing and redefinitions
import and variable shadowing and redefinitions
import re as regex
matches = set()
def is_digits(input_string):
regex = '(0|1|2|3|4|5|6|7|8|9)*' # the 'regex' variable defined here 'shadows' or hides the name of the module re
match = regex.fullmatch(regex, input_string) # this should thow an error, but if the 'regex' variable happened to thave the fullmatch method, this could slip in undetected
if match is not None:
matches = matches.union({match}) # this does NOT update the global variable
return True
return False
def process_matches(matches): # the parameter name shadows the global variable
for m in matches:
m = get_some_value() # this redefines the loop variable
...
function or method redefinitions and shadowing
function or method redefinitions and shadowing
def f():
print(1)
def g():
x = 2
def f(): # this shadows the global function 'f'
print(x)
f()
def f(): # this is an error
print(3)
class A:
def g(self):
print(6)
def __init__(self):
self.g = 5 # this attribute hides the method g
def f(self): # this is ok, as it is a method so it will never be confused with the outer 'f'
print(4)
def f(self):
print(5) # this is an error
Import errors
The imports should be placed at module-level, at the beggining of the file (not inside function or class definitions or within control blocks), should ideally import either the module as an object (optionally renamed) or only the required objects from that module, should not cause import cycles, and should not contain additional errors
- incorrect_imports.py
import nonexistent_module # module does not exist, remove this or replace with correct module
from time import * # only import what is needed. for example, 'form time import perf_counter'
form typing import Generic # unused import
from re import DFA # no object called 'DFA' exists in re
from numpy import * # if you use many different things from a module, consider importing the module (qualified import): 'import numpy' or 'import numpy as np'
import pandas as pandas # useless renaming using 'as'
import cyclic
...
- cyclic.py
import incorrect_imports # this will cause a cyclic import. python knows how to handle this most of the time, but it WILL cause some errors in certain situations
Global variables
While global variables may not cause issues in isolated cases, when producing code which is meant to be used by other codebases, like the case of our project, the use of global variables may cause issues in multithreaded or multiprocess applications. As such, the use of such variables is strongly discouraged. Exceptions to these cases are global constant definitions and type variables and aliases.
PI = 3.1416 # constants are a good use of global variables
T = typeVar('T') # typeVars will almost always need to be global
def GenericClass(Generic[T]):
pass
global_stack = []
def exported_function(param):
global global_stack
global_stack = []
fill_global_stack_with_possible_solutions(param)
for sol in global_stack:
check_solution(sol, param)
# if exported_function is called by 2 or more concurrent threads, they may interfere with each other, and there is a strong possibility that it will be called like that
Redundant and trivially simplifiable code
This includes unreachable code, expressions with no effect, unused variables, redundant code and expressions written in an overcomplicated manner (using extra unneeded statements or harder-to-read versions of statements, such as dunders instead of builtins).
inport time # this import is not used
def f():
x = 1 # this value is never used
1+2 # this statement has no effect
y = 1
y = y # this statement is redundant
return True
y += 1 # this code is unreachable
def g(z): # z is not used
if 1 < 2: # two errors: constant comparison (the condition always evaluates to True, so it can be safely replaced with True), and constant if condition (the 'then' branch is always executed, so the if can be removed)
# the following if can be easily simplified to 'return f()' or 'return bool(f())' if the result is not guaranteed to be a bool but is interpreted as such
if f():
return True
else:
return False
h = lambda x: x + 1 # use a more easily readable function def
t = (lambda x: x * 2)(4) # do not define a lambda just to immediatly call it, use the expression 4 * 2 directly (or, in this case, just write 8)
l = [1,2,3]
if not not 1 in l: # double not is redundant
print(not 1 < 2) # replace not less than with >=
s = {x for x in l} # do not use comprehensions, just use the built-in set() constructor
for i in range(len(l)):
print(l[i]) # do not iterate over range, then index the list, just iterate over the list (use enumerate if you also need the index)
def reurns_nothing():
print("imagine this changed some state")
return # this return is not needed
while True:
if some_condition(): # use 'while some_condition():' instead
break
print("do something")
def my_min(ls): # do not do this, python already has min, max, any, all, sum implemented (along with many others)
m = ls[0]
for x in ls[1:]:
if x < m:
m = x
return x
class A:
pass
a = A()
if a.__class__ == A: # use isinstance or match statements
print(a.__str__()) # use str(a)
Complexity
Your functions and methods should be modular and it should be easy to understand their control flow. We will look at the following metrics: statements per function, number of branches and loops (cyclomatic complexity).
Another side of this is that duplicate or very similar sequences of code should be refactored into functions or methods if possible.
Return consistency
All return statements of a function should return an exppression or none of them should
def f():
if some_condition():
return True
elif some_other_condition():
return # because the other return yields a value (True), this should explicitly return a value as well
# if both conditions are false, an implicit bare return exists here, which should be replaced by a return with a value
Dangerous default values
You should be careful when using default values for function/method parameters. Object values (such as list/set/dict literals) will be reused between calls and may lead to unexpected behaviour
def f(l = [])
l.append(1)
print(l)
f() # prints '1'
f() # prints '1 1'
f() # prints '1 1 1'
def f(l = None)
if l is None:
l = []
l.append(1)
print(l)
f() # prints '1'
f() # prints '1'
f() # prints '1'
These are cases in which your code is readable and without major performance issues, but could be improved by using more specific or clear syntax.
Having very few of these suggestions left unimplemented will grant you a bonus of up to 0.1 points per stage
(meaning you will be able to reach 1.1 points on each stage).
This bonus is only granted if no penalty is given from error-prone practices.
Comprehensions
You can use comprehensions instead of for loops, such as the ones described here to more succintly perform operations resulting in lists, sets or dictionaries by iterating on other objects, especially maps, filters, cartesian products, and flattens.
L = [1,2,3,4]
S = {
['a', 'b'],
['c', 'd']
}
D = {'a':1, 'b': 2, 'c':3}
# maps
double_l = []
for x in L:
l1.append(x * 2)
reverse_d = {}
for k,v in D.items():
reverse_d[v] = k
# filter
even_l = []
for x in L:
if x % 2 == 0:
even_l.append(x)
# cartesian product
all_pairs = []
for x in L:
for y in S:
all_pairs.append((x,y))
# inner object iteration (flattening)
flat_s = set()
for l in S:
for x in l:
flat_s.append(x)
# combined operations
comb = []
for x in L:
if x % 2 == 0:
for l in S:
for y in l:
comb.append(y * x)
equivalent comprehensions
equivalent comprehensions
L = [1,2,3,4]
S = {
['a', 'b'],
['c', 'd']
}
D = {'a':1, 'b': 2, 'c':3}
double_l = [x * 2 for x in L]
reverse_d = {v:k for k,v in D.items()}
even_l = [ x
for x in L
if x % 2 == 0
]
all_pairs = [(x, y)
for x in L
for y in S
]
flat_s = { x
for l in S
for x in l
}
comb = [ x * y
for x in L
if x % 2 == 0
for l in S
for y in l
]
Swapping variables
In python you can use a, b = b, a
to swap the values of 2 variables a and b instead of using temporary variables.
Type hints
Type hints can help others (including yourselves if you come back to the code after a small break) better understand your code, and allows software tools give you meaningful feedback (autocomplete/intellisense, live type errors and suggestions). It is not necessary to give hints to every single variable you define, but it is recommended to use type hints on functions, methods (both parameters and return types) and class members, and use generics whenever applicable.