====== 4. Regex Representation in Python ======

In this laboratory we will use **Python3** classes to introduce a possible in-code representation of regular expressions and ways of working with them.

We will start with a base ''class Regex'' that serves as an interface for its future descendants:

<code python>
class Regex:
    '''Base class for Regex ADT'''

    def __str__(self) -> str:
        '''Returns the string representation of the regular expression'''
        pass

    def gen(self) -> [str]:
        '''Return a representative set of strings that the regular expression can generate'''
        pass

    def eval_gen(self):
        '''Prints the set of strings that the regular expression can generate'''
        pass
</code>

We remark the following aspects:

<note>
There is no ''%%__init__(self)%%'' blueprint, so descendants of ''Regex'' can be instantiated with ''Descendant()'' unless we explicitly override this default behaviour. As mentioned in the [[https://ocw.cs.pub.ro/ppcarte/doku.php?id=lfa:2024:lab02|second laboratory]], the correspondences with ''Java'' are:
  * ''self'' $ \rightarrow $ ''this''
  * ''%%__init__%%'' $ \rightarrow $ class constructor
  * ''%%__str__(self)%%'' $ \rightarrow $ ''toString()''
</note>

<note>
We have structures of the form ''%%'''comment'''%%'' (docstrings). This is the preferred way of documenting classes and functions in Python, since editors typically show the docstring in a help box when you hover over a documented class or function.
</note>

<note>
We have only declared the interface of the methods ''%%__str__(self) -> str%%'', ''%%gen(self) -> [str]%%'' and ''%%eval_gen(self)%%''. The keyword ''pass'' lets us write empty function/class definitions; without it, the Python3 interpreter would raise a syntax error for an empty body.
</note>

<note>
Although Python is dynamically typed, we still encourage you to **write the types of parameters and return values explicitly**, as they **help document the code**. Moreover, when you write Python code in interviews, **interviewers usually pay attention to this and grade you accordingly**.
</note>

==== Your Task ====

Your task is to implement the missing pieces of code in the ''regex.py'' file, following the comments in the TODOs:

<code python>
# This is an auxiliary function used to order our generated strings by length and then alphabetically
# E.g: (b | aa)* = ['', b, aa, bb, aab, baa, bbb, aaaa, ...]
def convert_to_sorted_set(lst: [str]) -> [str]:
    '''Converts a list to a sorted set'''
    crt_lst = list(set(lst))
    return sorted(crt_lst, key = lambda x: (len(x), x))

# Global variable used to threshold the number of Star items
INNER_STAR_NO_ITEMS = 3

class Regex:
    '''Base class for Regex ADT'''

    def __str__(self) -> str:
        '''Returns the string representation of the regular expression'''
        pass

    def gen(self) -> [str]:
        '''Return a representative set of strings that the regular expression can generate'''
        pass

    def eval_gen(self):
        '''Prints the set of strings that the regular expression can generate'''
        pass

# TODO 0: Implement the Void class following the blueprint above
class Void(Regex):
    '''Represents the empty regular expression'''

    # Returns Void as a string
    def __str__(self) -> str:
        pass

    # Generates the Python object that corresponds to the Void class
    def gen(self) -> [str]:
        pass

    def eval_gen(self):
        print("Void generates nothing")

# TODO 1: Implement the Epsilon-String (Empty) class following the blueprint above
class Empty(Regex):
    '''Represents the empty string regular expression'''

    # Returns Empty as a string
    def __str__(self) -> str:
        pass

    # Returns the list of strings produced by the empty string
    def gen(self) -> [str]:
        pass

    def eval_gen(self):
        print("Empty string generates ''.")

# TODO 2: Implement the functionality required by the Symbol class following the blueprint above
class Symbol(Regex):
    '''Represents a symbol in the regular expression'''

    def __init__(self, char: str):
        self.char = char

    # TODO 2: Complete the __str__ and gen methods of the Symbol class
    def __str__(self) -> str:
        pass

    def gen(self) -> [str]:
        pass

    def eval_gen(self):
        # We want the words attribute of the current class to hold the strings generated by gen(self)
        self.gen()
        return f"Symbol {self.char} generates {self.words}"

# TODO 3: Implement the functionality required by the Union class following the blueprint above
class Union(Regex):
    '''Represents the union of two or more regular expressions'''

    def __init__(self, *arg: [Regex]):
        self.components = arg

    # TODO 3: Complete the __str__, gen and eval_gen methods of the Union class
    def __str__(self) -> str:
        pass

    def gen(self) -> [str]:
        pass

    # Following the Symbol model, we want the next eval_gen methods to implement
    # return f"Class {the class's __str__ (toString) form} generates {self.words}"
    def eval_gen(self) -> str:
        pass

# TODO 4: Implement the functionality required by the Concat class following the blueprint above
class Concat(Regex):
    '''Represents the concatenation of two or more regular expressions'''

    def __init__(self, *arg : [Regex]):
        self.components = arg

    # TODO 4: Complete the __str__, gen and eval_gen methods of the Concat class
    def __str__(self) -> str:
        pass

    def gen(self) -> [str]:
        pass

    def eval_gen(self) -> str:
        pass

# TODO 5: Implement the functionality required by the Star class following the blueprint above
class Star(Regex):
    '''Represents the Kleene star (zero or more repetitions) of a regular expression'''

    def __init__(self, regex: Regex):
        self.regex = regex
        self.base_words = [""]
        self.base_gen()
        self.words = []

    # TODO 5: Complete the __str__, gen and eval_gen methods of the Star class
    def __str__(self):
        pass

    def base_gen(self):
        '''To memorize the base words generated by the regex inside the star, we store them in a list'''
        if isinstance(self.regex, Star):
            # Implement Star.gen with a limiting threshold
            pass
        # Implement Regex.gen for the rest
        pass

    def gen(self, no_items = 10):
        pass

    def eval_gen(self, no_items = 10):
        pass

if __name__ == "__main__":
    # Example usage
    r1 = Symbol('a')
    r2 = Symbol('b')
    regex_union = Union(r1, r2)
    regex_union.eval_gen()

    e2 = Union(Symbol('a'), Symbol('b'), Symbol('c'))
    e4 = Concat(r1, e2)
    e4.eval_gen()

    e5 = Concat(e2, e2, r2)
    e5.eval_gen()

    star_ex = Star(r1)
    star_ex.eval_gen()

    regex_concat = Concat(regex_union, star_ex)
    regex_concat.eval_gen()

    last_expr = Concat(Star(Union(Symbol('a'), Symbol('b'))), Symbol('b'), Star(Symbol('c')))
    last_expr.eval_gen()
</code>

To be written:
  * Observations on what these look like (Concat, Star, etc.).
  * Example output for the main block.
  * A note that students who prefer a different implementation may do as they wish regarding: the representation of Symbol and its link with Union and Concat (why a symbol generates a list), A*, and the number of parameters accepted by Concat/Union.
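
The ordering produced by the auxiliary ''convert_to_sorted_set'' function from the skeleton can be checked in isolation, before any of the classes are implemented. The snippet below is only a small sketch: the function is copied from the skeleton, while the ''sample'' list is a hypothetical input made up for illustration (a few words of ''(b | aa)*'', unordered and with duplicates).

```python
def convert_to_sorted_set(lst: [str]) -> [str]:
    '''Converts a list to a sorted set (same helper as in the skeleton)'''
    crt_lst = list(set(lst))  # drop duplicates
    # Sort by length first, then alphabetically among words of equal length
    return sorted(crt_lst, key=lambda x: (len(x), x))


# Hypothetical sample input, not part of the lab skeleton
sample = ["bb", "aa", "b", "aab", "b", "", "aa"]
ordered = convert_to_sorted_set(sample)
print(ordered)  # ['', 'b', 'aa', 'bb', 'aab']
```

Sorting on the pair ''(len(x), x)'' is what places the empty string first and shorter words before longer ones; a plain alphabetical sort would instead place ''aa'' and ''aab'' before ''b''.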