Differences

This shows you the differences between two versions of the page.

lfa:lab [2018/09/26 12:52]
pdmatei
lfa:lab [2018/10/02 11:20] (current)
pdmatei
Line 1: Line 1:
 ====== Lab ====== ====== Lab ======
 +
 +
  
 ===== Lab-01 - Expresii regulate ===== ===== Lab-01 - Expresii regulate =====
Line 6: Line 8:
  
 Key insights: Key insights:
- * more languages than regular expressions +  ​* more languages than regular expressions 
- * we do not know how to write regular expressions for some languages (as a direct consequence of the above) +  * we do not know how to write regular expressions for some languages (as a direct consequence of the above) 
- * reg.exps. are unambiguous and FINITE language representations+  * reg.exps. are unambiguous and FINITE language representations
  
  
 Objectives: Objectives:
- * Understand what a language and a regular expression are, and the relation between them; +  * Understand what a language and a regular expression are, and the relation between them; 
- * Write several regular expressions for designated languages +  * Write several regular expressions for designated languages 
- * Identify languages described by some regular expressions (e.g. ?!)+  * Identify languages described by some regular expressions (e.g. ?!)
  
 Resources (tentative):​ Resources (tentative):​
- * http://​www.idt.mdh.se/​kurser/​cd5560/​10_01/​examination/​KOMPENDIER/​Regular/​kompendium_eng.pdf+  ​* http://​www.idt.mdh.se/​kurser/​cd5560/​10_01/​examination/​KOMPENDIER/​Regular/​kompendium_eng.pdf 
 + 
 +=== Exercises === 
 +**I. What is the regular expression for the following languages:​** 
 + 
 +  * $math[\Sigma=\{0,​ 1\}], $math[L=\{011\}] 
 +//​Solution://​ $math[E=011] \\ 
 +//Obs:// By definition the correct expression is $math[E=((01)1)], but we omit parentheses when they are not needed: a precedence rule (Kleene star > concatenation > union) keeps the number of parentheses in regular expressions as small as possible. 
 +  * $math[\Sigma=\{a,​ b\}], $math[L=\{a,​ b\}] 
 +//​Solution://​ $math[E=a \cup b] \\ 
 +//Obs:// By definition the correct expression is $math[E=(a \cup b)], but we can remove parentheses for the same reason as above. 
 +  * $math[\Sigma=\{0, 1\}], $math[L=\{\epsilon, 0, 1, 00, 01, 10, 11, 000, ...\}] 
 +//​Solution://​ $math[E=(0 \cup 1)^{*}] \\ 
 +  * $math[\Sigma=\{0,​ 1\}], $math[L=\{010010101000010,​ 010010101000011\}] 
 +//​Solution://​ $math[E=01001010100001(0 \cup 1)] \\ 
 +  * $math[\Sigma=\{0,​ 1\}], $math[L=\{w \in \Sigma^{*} \mid w \text{ ends with } 0\}] \\ 
 +//​Solution://​ $math[E=(0 \cup 1)^{*}0] \\ 
 +  * $math[\Sigma=\{0,​ 1\}], $math[L=\{w \in \Sigma^{*} \mid w=w_101 \lor w=1w_1, w_1 \in \Sigma^{*}\}] \\ 
 +//Solution:// $math[E=(0 \cup 1)^{*}01 \cup 1(0 \cup 1)^{*}] \\ 
 +  * $math[\Sigma=\{x,​ y\}], $math[L=\{w \in \Sigma^{*} \mid \#_x(w) = 2\}] \\ 
 +//​Solution://​ $math[E=y^{*}xy^{*}xy^{*}] \\ 
 +//Obs:// $math[\#​_x(w)] denotes number of $math[x] in $math[w]. 
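 +  * $math[\Sigma=\{a, b\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) \,\vdots\, 2\}] \\ 
 +//Solution:// $math[E=b^{*}(ab^{*}ab^{*})^{*}] \\ 
 +//Obs:// $math[\#_a(w) \,\vdots\, 2] means that $math[\#_a(w)] is divisible by 2. Words with no $math[a] at all (such as $math[b]) belong to $math[L], so the expression must also generate $math[b^{*}]. 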
 +  * $math[\Sigma=\{x,​ y\}], $math[L=\{w \in \Sigma^{*} \mid \#_x(w) \ge 1\}] \\ 
 +//​Solution://​ $math[E=(x \cup y)^{*}x(x \cup y)^{*}] \\ 
 +  * $math[\Sigma=\{a,​ b, c\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) \ge 1 \land \#_c(w) \ge 1\}] \\ 
 +//Solution:// $math[E=((a \cup b \cup c)^{*}a(a \cup b \cup c)^{*}c(a \cup b \cup c)^{*}) \cup ((a \cup b \cup c)^{*}c(a \cup b \cup c)^{*}a(a \cup b \cup c)^{*})] \\ 
 +  * $math[\Sigma=\{a,​ b\}], $math[L=\{w \in \Sigma^{*} \mid w \text{ does not contain } ba\}] \\ 
 +//​Solution://​ $math[E=a^{*}b^{*}] \\ 
 +  * $math[\Sigma=\{a,​ b\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) = 0\}] \\ 
 +//​Solution://​ $math[E=\epsilon] \\ 
 +  * $math[\Sigma=\{a,​ b\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) < 0\}] \\ 
 +//​Solution://​ $math[E=\emptyset] \\ 
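The solutions above can be sanity-checked mechanically. The following is a minimal sketch, assuming Java's ''java.util.regex'' syntax (union is written ''|'', the Kleene star is ''*''). The class name ''RegexCheck'' and the method ''accepts'' are illustrative and not part of the lab material.

```java
import java.util.regex.Pattern;

class RegexCheck {
    // True when the whole word w belongs to the language described by regex.
    static boolean accepts(String regex, String w) {
        return Pattern.matches(regex, w);
    }

    public static void main(String[] args) {
        // E = (0 U 1)*0 : words over {0,1} that end with 0
        System.out.println(accepts("(0|1)*0", "0110"));   // true
        System.out.println(accepts("(0|1)*0", "011"));    // false
        // E = y*xy*xy* : words over {x,y} with exactly two x's
        System.out.println(accepts("y*xy*xy*", "yxyyx")); // true
        System.out.println(accepts("y*xy*xy*", "xxx"));   // false
        // E = a*b* : words over {a,b} that do not contain "ba"
        System.out.println(accepts("a*b*", "aabbb"));     // true
        System.out.println(accepts("a*b*", "aba"));       // false
    }
}
```

Testing a handful of words does not prove that an expression is correct, but a single word that is wrongly accepted or rejected is enough to show that it is wrong.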
 + 
 +===== Lab x - JFlex ===== 
 + 
 +==== Installing JFlex ==== 
 + 
 +A complete, platform-dependent set of installation instructions can be found [[http://​jflex.de/​installing.html| here]]. In a nutshell, JFlex comes as a binary app ''​jflex''​. 
 + 
 +==== The structure of a flex file ==== 
 + 
 +Consider the following simple JFlex file: 
 +<code java> 
 +import java.util.*;​ 
 + 
 +%% 
 + 
 +%class HelloLexer 
 +%standalone 
 + 
 +%{ 
 +  public Integer words = 0; 
 +%} 
 + 
 +LineTerminator = \r|\n|\r\n 
 + 
 +%%    
 + 
 +[a-zA-Z]+ { words+=1; } 
 +{LineTerminator} { /* do nothing*/ } 
 +</​code>​ 
 + 
 +Suppose the above file is called ''​Hello.flex''​. Running the command ''​jflex Hello.flex''​ will generate a Java class which implements a lexer. 
 + 
 +Each JFlex file (such as the one above) contains 5 sections: 
 +  * the first section, which ends at the first occurrence of ''%%'', contains declarations which will be added at the beginning of the Java class file (here, an ''import''). 
 +  * the second section, right after ''%%'' and until ''%{'', contains a sequence of options for jflex. Here, we use two options: 
 +      * ''%class HelloLexer'' tells jflex that the generated lexer class should be named ''HelloLexer'' 
 +      * ''%standalone'' tells jflex to print unmatched input to standard output and continue scanning. 
 +      * More details regarding possible options can be found in the [[http://jflex.de/manual.pdf|JFlex docs]]. 
 +  * the third section, delimited by ''%{'' and ''%}'', contains declarations which will be copied into the generated lexer class. Here we declare a public variable ''words''. 
 +  * the fourth section contains regular expression **declarations**. Here, we have declared ''LineTerminator'' to be the regular expression ''\r | \n | \r\n''. Declarations can be used to build more complicated regexps from simpler ones, and can also be referenced in the fifth section of the flex file. 
 +  * the fifth section, after the second ''%%'', contains rules and actions: a rule specifies a regular expression to be scanned, as well as the action to be taken when a word satisfying the regexp is found: 
 +    * the rule ''[a-zA-Z]+ { words+=1; }'' states that whenever ''[a-zA-Z]+'' (a regexp defined inline) is matched by a word, ''words+=1;'' should be executed; 
 +    * the rule ''{LineTerminator} { /* do nothing*/ }'' refers to the regexp declared above (note the braces); here no action is executed; 
 +    * JFlex always scans for the **longest** input word which satisfies a regexp. When the longest word satisfies more than one regexp, the **first** such regexp in the flex file is matched. 
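The longest-match / first-rule policy described in the last bullet can be illustrated outside JFlex. The sketch below is plain Java built on ''java.util.regex'', not JFlex-generated code. The rule names ''KEYWORD'' and ''IDENT'' and the class ''LongestMatch'' are hypothetical and chosen only for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatch {
    // Rules in "file order" (LinkedHashMap preserves insertion order).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("KEYWORD", Pattern.compile("if"));
        RULES.put("IDENT", Pattern.compile("[a-z]+"));
    }

    // Returns the name of the rule selected at the start of input,
    // or null when no rule matches there.
    static String choose(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // For these simple patterns the greedy match is also the longest one.
            // The strict '>' keeps the earlier rule when two matches tie in length.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(choose("if"));   // both rules match 2 characters: first rule wins
        System.out.println(choose("iffy")); // IDENT matches 4 characters: longest match wins
    }
}
```

Swapping the two rules would make ''IDENT'' win on the input ''if'' as well, which is why more specific rules are usually written first in a flex file.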
 + 
 +==== Compiling a Hello World project ==== 
 + 
 +After performing:​ 
 +<​code>​ 
 +jflex Hello.flex 
 +</​code>​ 
 + 
 +we obtain ''​HelloLexer.java''​ which contains the ''​HelloLexer''​ public class implementing our lexer. We can easily include this class in our project, e.g.: 
 + 
 +<code java> 
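 +import java.io.*;
 +import java.util.*;
 +
 +public class Hello {
 +  public static void main (String[] args) throws IOException {
 +    // args[0] is the path of the file to be scanned
 +    HelloLexer l = new HelloLexer(new FileReader(args[0]));
 +
 +    // run the scanner until the end of the input
 +    l.yylex();
 +
 +    // number of words counted by the [a-zA-Z]+ rule
 +    System.out.println(l.words);
 +  }
 +}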
 +</​code>​ 
 +  * Note that the lexer constructor receives a Java ''Reader'' as input (other options are possible, see the docs), and we take the name of the file to be scanned from the first command-line argument (''args[0]''). 
 +  * Each lexer implements the method ''​yylex''​ which starts the scanning process. 
 + 
 +After compiling:​ 
 +<​code>​ 
 +javac HelloLexer.java Hello.java 
 +</​code>​ 
 + 
 +and running: 
 + 
 +<​code>​ 
 +java Hello <input-file>
 +</​code>​ 
 + 
 +we obtain: 
 +<​code>​ 
 +  
 +  
 + 
 + 6 
 +</​code>​ 
 +at standard output. 
 + 
 +Recall that the option ''standalone'' tells the lexer to print unmatched input. In our example, the unmatched input consists of whitespace characters. 
 + 
 +==== Application - parsing lists ==== 
 + 
 +Consider the following BNF grammar which describes lists: 
 +<​code>​ 
 +<integer> ::= [0-9]+
 +<op> ::= "++" | ":"
 +<element> ::= <integer> | <op> | <list>
 +<sequence> ::= <element> | <element> " " <sequence>
 +<list> ::= "()" | "(" <sequence> ")"
 +</​code>​ 
 + 
 +The following are examples of lists: 
 +<​code>​ 
 +(1 2 3) 
 +(1 (2 3) 4 ()) 
 +(1 (++ (: 2 (3)) (4 5)) 6) 
 +</​code>​ 
 + 
 +Your task is to: 
 +  * correctly parse such lists: 
 +    * write a JFlex file to implement the lexer: 
 +      * Since the language describing lists is context-free, in order to parse a list you need to keep track of the opened/closed parentheses. 
 +      * Start by writing a PDA (on paper) which accepts correctly-formed lists. Treat each regular expression you defined (for numbers and operators) as a single symbol; 
 +      * Implement the PDA (strategy) in the lexer file; 
 +  * given a correctly-formed list, write a procedure which evaluates list operations (in the standard way). For instance, ''(1 (++ (: 2 (3)) (4 5)) 6)'' evaluates to ''(1 (2 3 4 5) 6)'' 
 +  * write a procedure which checks if a list is **semantically valid**. What type of checks do you need to implement?
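As a starting point for the parenthesis-tracking part only, the PDA idea can be sketched in plain Java before moving it into the lexer file. This is a sketch under assumptions, not a reference solution: the names ''ListChecker'' and ''isWellFormed'' are hypothetical, the check is more lenient than the grammar about spaces, and it performs no evaluation or semantic validation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListChecker {
    // Token classes taken from the grammar: integers, "++", ":", "(" and ")".
    private static final Pattern TOKEN = Pattern.compile("[0-9]+|\\+\\+|:|\\(|\\)");

    // Simulates a PDA whose stack holds one symbol per currently open parenthesis.
    static boolean isWellFormed(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        Matcher m = TOKEN.matcher(input);
        boolean closedOutermost = false;
        int pos = 0;
        while (pos < input.length()) {
            if (input.charAt(pos) == ' ') { pos++; continue; } // separators
            if (closedOutermost) return false;   // tokens after the outermost list
            m.region(pos, input.length());
            if (!m.lookingAt()) return false;    // not a token of the language
            String tok = m.group();
            if (tok.equals("(")) {
                stack.push('(');
            } else if (tok.equals(")")) {
                if (stack.isEmpty()) return false; // closing without opening
                stack.pop();
                if (stack.isEmpty()) closedOutermost = true;
            } else if (stack.isEmpty()) {
                return false;                    // integer or operator outside a list
            }
            pos = m.end();
        }
        // Accept only if the outermost list was opened and closed.
        return closedOutermost;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("(1 (2 3) 4 ())"));
        System.out.println(isWellFormed("(1 (2 3)"));
    }
}
```

In a JFlex implementation the same stack (or a simple depth counter) could live in the ''%{ ... %}'' section and be updated from the actions of the rules matching ''('' and '')''.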