Consider the following language encoding expressions:
10.1.1. Implement an AST for expressions.
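One possible shape for such an AST is sketched below. The grammar itself is not reproduced here, so the node types `Digit`, `Plus` and `Mult`, as well as the helper `evaluate`, are assumptions chosen for illustration (digits combined with '+' and '*'); adapt them to the actual grammar.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Digit:
    value: int  # a single digit, 0-9 (assumed leaf node)


@dataclass
class Plus:
    left: "Expr"
    right: "Expr"


@dataclass
class Mult:
    left: "Expr"
    right: "Expr"


# Hypothetical union of all node types; not taken from the exercise text.
Expr = Union[Digit, Plus, Mult]


def evaluate(e: Expr) -> int:
    """Recursively compute the integer value of an expression tree."""
    if isinstance(e, Digit):
        return e.value
    if isinstance(e, Plus):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mult):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError(f"not an expression: {e!r}")


# The tree for 1 + 2 * 3, with '*' binding tighter than '+'.
tree = Plus(Digit(1), Mult(Digit(2), Digit(3)))
```

Keeping the tree separate from the parser means the same structure can later be evaluated, printed or transformed without touching the parsing code.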
parse_whitespace(" lfa") = "lfa"
parse_whitespace("lfa") = None
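A minimal sketch consistent with the two examples above. The examples do not say whether one whitespace character or a whole run should be consumed; this version assumes the whole leading run is consumed and that parsing fails when there is none.

```python
def parse_whitespace(w):
    """Consume leading whitespace; return the rest, or None if there is none.

    Assumption: all leading whitespace is consumed, not just one character.
    """
    rest = w.lstrip()
    if len(rest) == len(w):
        return None  # nothing was consumed, so parsing fails
    return rest
```

Like the other parsers in this section, the function returns the unconsumed remainder of the input on success and None on failure.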
Another example:
    stack = []  # values parsed so far

    def parse_digit(w):
        if len(w) == 0:
            return None  # parsing fails
        if w[0].isdigit():
            stack.append(w[0])  # add the parsed digit to the stack
            return w[1:]  # return the rest of the word
        else:
            return None  # if the character is not a digit, the parsing fails
10.1.3. Implement a function parse_plus which parses the character '+': if the first character is '+', it consumes it; otherwise it fails. Hint: write a more general function which you can then reuse to parse other characters.
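The hint can be sketched as follows. The names `parse_char` and `parse_star` are hypothetical, introduced only for illustration; the convention of returning the rest of the word on success and None on failure follows parse_digit.

```python
def parse_char(c, w):
    """Consume the character c from the front of w, or fail with None."""
    if len(w) > 0 and w[0] == c:
        return w[1:]  # the rest of the word after c
    return None  # empty input or a different first character


def parse_plus(w):
    return parse_char('+', w)


# The same helper covers any other single character, for example '*'.
def parse_star(w):
    return parse_char('*', w)
```

This sketch does not touch the stack; whether an operator should be recorded there depends on how the AST is built in the later exercises.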
10.1.4. We can build more complex parsers from simpler ones. The key idea is to try one way of parsing the input and, if that fails, to fall back on an alternative.
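The "try, then fall back" idea can be sketched in isolation as below. The helpers `parse_char`, `parse_either` and `parse_operator` are hypothetical names, not part of the exercise, and the stack is left out to keep the example short.

```python
def parse_char(c, w):
    """Consume the character c from the front of w, or fail with None."""
    if len(w) > 0 and w[0] == c:
        return w[1:]
    return None


def parse_either(first, second, w):
    """Try the parser first on w; if it fails, try second on the same input."""
    rest = first(w)
    if rest is not None:
        return rest
    return second(w)  # the fallback starts again from the original w


def parse_operator(w):
    """Accept either '+' or '*' as the next character."""
    return parse_either(
        lambda s: parse_char('+', s),
        lambda s: parse_char('*', s),
        w,
    )
```

Note that the fallback restarts from the original input. A parser that also pushes values onto a shared stack would likely need to undo those pushes when an alternative fails partway through.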
Complete the following implementation of the function parse_multiplication:
    def parse_multiplication(w):
        if len(w) == 0:
            return None
        w1 = parse_digit(w)  # parse a digit
        if w1 is not None:
            # we have parsed a digit, now we try to parse '+':
            w2 = parse_plus(w1)
            if w2 is not None:
                # we have successfully parsed a '+'
                w3 = parse_multiplication(w2)
                if w3 is not None:
                    # we have parsed a digit followed by + and by another multiplication expression
                    # what are the contents of the stack right now?
                    # how should the stack be modified?
                    pass  # to be completed
            else:
                # parsing a '+' has failed, so we just return the rest of the string w1
                return w1
        else:
            return None  # parsing a digit failed
10.1.5. Following the same structure, write a complete implementation for expression parsers.
10.2.1. Write a grammar which accurately describes regular expressions. Consider the following definition: a regular expression is built in the usual way, using the symbols (, ), *, | and any alphanumeric character. Whitespace may occur freely within the expression.
10.2.2. Starting from the solution to the previous exercise, write an unambiguous grammar for regexes.
10.2.3. Write a parser for regular expressions.