----
**3.1.1.** Suppose $math[A_1] is a DFA and $math[w]=''aabaaabb'' is a word. Find the **longest prefix** of $math[w] which is accepted by $math[A_1].
{{ :lfa:lexer-a1.png?300 |}}
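To check an answer mechanically, the longest accepted prefix can be found by running the DFA over $math[w] once and remembering the last position at which it was in a final state. A minimal Python sketch follows; note that the automaton instantiated below is a **stand-in** (it accepts words with an even number of ''b''s), not the $math[A_1] from the figure:

```python
def longest_accepted_prefix(delta, start, finals, w):
    """Longest prefix of w accepted by the DFA (delta, start, finals)."""
    state, last = start, (0 if start in finals else -1)
    for i, ch in enumerate(w):
        state = delta.get((state, ch))
        if state is None:          # the DFA blocks: no longer prefix can be accepted
            break
        if state in finals:
            last = i + 1           # remember the last accepting position
    return None if last < 0 else w[:last]

# Stand-in DFA (an assumption, not the A_1 of the figure):
# accepts words over {a, b} with an even number of b's.
delta = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
print(longest_accepted_prefix(delta, 0, {0}, "aabaaabb"))  # -> aabaaab
```

The same function, given the transition table of the real $math[A_1], answers the exercise directly.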
----
----
**3.1.2.** Split the following word $math[w]=''ababbaabbabaab'' using $math[A_2] as the only token.
{{ :lfa:lexer-a2.png?300 |}}
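Splitting with a single DFA is the **maximal munch** loop: repeatedly take the longest accepted prefix of the remaining input and emit it as a lexeme. A hedged Python sketch (the DFA below is a stand-in accepting ''a*b'', not the $math[A_2] from the figure):

```python
def longest_prefix_end(delta, start, finals, w, pos):
    """End index of the longest accepted prefix of w[pos:], or -1 if none."""
    state, last = start, (pos if start in finals else -1)
    for i in range(pos, len(w)):
        state = delta.get((state, w[i]))
        if state is None:
            break
        if state in finals:
            last = i + 1
    return last

def split(delta, start, finals, w):
    """Maximal-munch split of w into lexemes accepted by the DFA."""
    lexemes, pos = [], 0
    while pos < len(w):
        end = longest_prefix_end(delta, start, finals, w, pos)
        if end <= pos:             # no (non-empty) prefix accepted here
            raise ValueError(f"lexing error at position {pos}")
        lexemes.append(w[pos:end])
        pos = end
    return lexemes

# Stand-in DFA (an assumption, not the A_2 of the figure): accepts a*b.
delta = {(0, 'a'): 0, (0, 'b'): 1}
print(split(delta, 0, {1}, "ababbaabbabaab"))
# -> ['ab', 'ab', 'b', 'aab', 'b', 'ab', 'aab']
```

Running the same loop with the real $math[A_2] gives the requested split.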
**3.1.3.** Given DFAs $math[A_3], $math[A_4] and $math[A_5], use them to split the word $math[w]=''abaaabbabaaaab'' into lexemes.
^^^^
| {{ :lfa:lexer-a3.png?300 |}} | {{ :lfa:lexer-a4.png?200 |}} | {{ :lfa:lexer-a5.png?200 |}} |
Let:
  * $math[A] be a DFA which matches lowercase character sequences (''[a-z]+''), ending with a whitespace (e.g. ''aba '')
  * while $math[B] matches "''def ''" (the four-character sequence). Let $math[w]="''def deffunction ''".
Suppose:
  * $math[A] has higher priority than $math[B]. How will the string be split? (What are the lexemes?)
  * $math[B] has higher priority than $math[A]. What will the splitting look like?
  * finally, let us return to the **maximal match** principle. How should the DFAs $math[A] and $math[B] be ordered (w.r.t. priority) so that our word is split in the correct way (assuming Python syntax)?
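The effect of the two orderings can be simulated with Python regexes standing in for the two DFAs (a sketch, not a full DFA implementation): ''re.match'' already yields the longest match of each pattern, and ties in length are broken by the order of the spec list, i.e. by priority.

```python
import re

def lex(spec, w):
    """spec: list of (name, pattern) in decreasing priority order."""
    out, pos = [], 0
    while pos < len(w):
        best = None
        for name, pat in spec:
            m = re.match(pat, w[pos:])
            # strict '>' keeps the earlier (higher-priority) entry on ties
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (name, m.group())
        if best is None:
            raise ValueError(f"lexing error at position {pos}")
        out.append(best)
        pos += len(best[1])
    return out

A = ("A", r"[a-z]+ ")   # lowercase run followed by a space
B = ("B", r"def ")      # the keyword "def" followed by a space
w = "def deffunction "

print(lex([A, B], w))   # A first: [('A', 'def '), ('A', 'deffunction ')]
print(lex([B, A], w))   # B first: [('B', 'def '), ('A', 'deffunction ')]
```

Note that ''def '' and ''deffunction '' are both matched maximally, so priority only matters on the first lexeme, where both patterns match exactly four characters.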
===== 3.3. Implementation =====
**3.3.1.** Implement a three-DFA lexer with DFAs $math[A_3], $math[A_4] and $math[A_5]. You can use the code from last lab to directly instantiate the three DFAs. The input should be a word, and the output should be a string of the form ''<token_1>:<lexeme_1> ... <token_n>:<lexeme_n>'', where ''<token_i>'' is the DFA's id (from 3 to 5) and ''<lexeme_i>'' is the matched lexeme.
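One possible shape of the solution is sketched below. Since $math[A_3], $math[A_4] and $math[A_5] are only given as figures, the snippet instantiates **toy stand-ins** (''a+'', ''b+'' and the single word ''ab''); the lexing loop itself is the part the task asks for: at each position, run every DFA, keep the longest match, and break length ties in favour of the higher-priority (lower-id) DFA.

```python
def longest_match(dfa, w, pos):
    """End index of the longest prefix of w[pos:] the DFA accepts, or -1."""
    delta, start, finals = dfa
    state, last = start, (pos if start in finals else -1)
    for i in range(pos, len(w)):
        state = delta.get((state, w[i]))
        if state is None:
            break
        if state in finals:
            last = i + 1
    return last

def lex(dfas, w):
    """dfas: {token_id: dfa}; dict order (Python 3.7+) is priority order."""
    pieces, pos = [], 0
    while pos < len(w):
        best_end, best_id = pos, None
        for tid, dfa in dfas.items():
            end = longest_match(dfa, w, pos)
            if end > best_end:   # strict '>': ties go to the earlier DFA
                best_end, best_id = end, tid
        if best_id is None:
            raise ValueError(f"lexing error at position {pos}")
        pieces.append(f"{best_id}:{w[pos:best_end]}")
        pos = best_end
    return " ".join(pieces)

# Toy stand-ins, NOT the A_3..A_5 from the figures:
dfas = {
    3: ({(0, 'a'): 1, (1, 'a'): 1}, 0, {1}),   # a+
    4: ({(0, 'b'): 1, (1, 'b'): 1}, 0, {1}),   # b+
    5: ({(0, 'a'): 1, (1, 'b'): 2}, 0, {2}),   # exactly "ab"
}
print(lex(dfas, "aabbab"))  # -> 3:aa 4:bb 5:ab
```

Replacing the stand-ins with the DFA instances built from last lab's code gives the required lexer unchanged.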