Differences

This shows you the differences between two versions of the page.

--- lfa:lab [2018/10/02 10:32]
pdmatei
+++ lfa:lab [2018/10/02 11:20] (current)
pdmatei
@@ Line 58: / Line 58: @@
 ==== Installing JFlex ====
-A complete, platform-dependent set of installation instructions can be found [[http://jflex.de/installing.html| here]]
+A complete, platform-dependent set of installation instructions can be found [[http://jflex.de/installing.html| here]]. In a nutshell, JFlex comes as a binary app ''jflex''.
 ==== The structure of a flex file ====
-==== Writing a Hello World project ====
+Consider the following simple JFlex file:
+<code java>
+import java.util.*;
+%%
+%class HelloLexer
+%standalone
+%{
+  public Integer words = 0;
+%}
+LineTerminator = \r|\n|\r\n
+%%
+[a-zA-Z]+ { words+=1; }
+{LineTerminator} { /* do nothing*/ }
+</code>
+Suppose the above file is called ''Hello.flex''. Running the command ''jflex Hello.flex'' will generate a Java class which implements a lexer.
+Each JFlex file, such as the one above, contains 5 sections:
+  * the first section, which ends at the first occurrence of ''%%'', contains declarations which will be added at the beginning of the Java class file.
+  * the second section, right after ''%%'' and until ''%{'', contains a sequence of options for jflex. Here, we use two options:
+      * ''%class HelloLexer'' tells jflex that the generated lexer class should be named ''HelloLexer''
+      * ''%standalone'' tells jflex to print each unmatched input word to standard output and continue scanning.
+      * More details regarding possible options can be found in the [[http://jflex.de/manual.pdf|JFlex docs]].
+  * the third section, delimited by ''%{'' and ''%}'', contains declarations which will be copied into the generated lexer class. Here we declare a public variable ''words''.
+  * the fourth section contains regular expression **declarations**. Here, we have declared ''LineTerminator'' to be the regular expression ''\r|\n|\r\n''. Declarations can be used to build more complicated regexps from simple ones, and can also be referred to in the fifth section of the flex file;
+  * the fifth section contains rules and actions: a rule specifies a regular expression to be scanned for, as well as the action to be taken when a word satisfying the regexp is found:
+    * the rule ''[a-zA-Z]+ { words+=1; }'' states that whenever ''[a-zA-Z]+'' (a regexp defined inline) is matched by a word, ''words+=1;'' should be executed;
+    * the rule ''{LineTerminator} { /* do nothing*/ }'' refers to the regexp declared above (note the curly braces around its name); here no action is executed;
+    * JFlex will always scan for the **longest** input word which satisfies a regexp. When that word satisfies more than one regexp, the **first** one from the flex file will be matched.
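The longest-match and first-rule-wins policy described above can be illustrated with a small sketch in plain Java. This is not how JFlex works internally (JFlex compiles all rules into a single automaton); it only simulates the same policy with ''java.util.regex''. The class name ''LongestMatchDemo'', the rule names and the three rules are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchDemo {
    // Hypothetical rules, listed in priority order (earlier = higher priority).
    static final String[] NAMES = {"KEYWORD_IF", "IDENTIFIER", "NUMBER"};
    static final Pattern[] RULES = {
        Pattern.compile("if"),
        Pattern.compile("[a-zA-Z]+"),
        Pattern.compile("[0-9]+")
    };

    // Splits the input into "RULE:word" tokens, choosing at every position
    // the rule with the longest match; ties go to the earliest rule.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int bestLen = 0;
            int bestRule = -1;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at the start of the region.
                // For these simple patterns the greedy match is also the longest one.
                if (m.lookingAt()) {
                    int len = m.end() - pos;
                    // Strict '>' keeps the earlier rule when lengths are equal.
                    if (len > bestLen) {
                        bestLen = len;
                        bestRule = i;
                    }
                }
            }
            if (bestRule < 0) {
                pos++; // no rule matches here: skip one character
                continue;
            }
            tokens.add(NAMES[bestRule] + ":" + input.substring(pos, pos + bestLen));
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [KEYWORD_IF:if, IDENTIFIER:iffy, NUMBER:42]
        System.out.println(tokenize("if iffy 42"));
    }
}
```

On the input ''if iffy 42'', the word ''if'' is matched by both the first and the second rule with the same length, so the first rule wins; ''iffy'' is matched by the second rule because its match is longer than the two characters matched by the first.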
+==== Compiling a Hello World project ====
+After performing:
+<code>
+jflex Hello.flex
+</code>
+we obtain ''HelloLexer.java'' which contains the ''HelloLexer'' public class implementing our lexer. We can easily include this class in our project, e.g.:
+<code java>
+import java.io.*;
+import java.util.*;
+public class Hello {
+  public static void main (String[] args) throws IOException {
+    HelloLexer l = new HelloLexer(new FileReader(args[0]));
+    l.yylex();
+    System.out.println(l.words);
+  }
+}
+</code>
+  * Note that the lexer constructor receives a Java ''Reader'' as input (other options are possible, see the docs), and we take the name of the file to be scanned from the first command-line argument (''args[0]'').
+  * Each lexer implements the method ''yylex'', which starts the scanning process.
+After compiling:
+<code>
+javac HelloLexer.java Hello.java
+</code>
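+and running the program on a file to scan, whose name is passed as the first command-line argument (''<input-file>'' is a placeholder):
+<code>
+java Hello <input-file>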
+</code>
+we obtain:
+<code>
+
+</code>
+at standard output.
+Recall that the option ''standalone'' tells the lexer to print unmatched words. In our example, the unmatched words are whitespace characters.
 ==== Application - parsing lists ====
+Consider the following BNF grammar which describes lists:
+<code>
+<integer> ::= [0-9]+
+<op> ::= "++" | ":"
+<element> ::= <integer> | <op> | <list>
+<sequence> ::= <element> | <element> " " <sequence>
+<list> ::= "()" | "(" <sequence> ")"
+</code>
+The following are examples of lists:
+<code>
+(1 2 3)
+(1 (2 3) 4 ())
+(1 (++ (: 2 (3)) (4 5)) 6)
+</code>
+Your task is to:
+  * correctly parse such lists:
+    * write a JFlex file to implement the lexer:
+      * Since the language describing lists is context-free, in order to parse a list, you need to keep track of the opened/closed parentheses.
+      * Start by writing a PDA (on paper) which accepts correctly-formed lists. Treat each regular expression you defined (for numbers and operators) as a single symbol;
+      * Implement the PDA (strategy) in the lexer file;
+  * given a correctly-defined list, write a procedure which evaluates list operations (in the standard way). For instance, ''(1 (++ (: 2 (3)) (4 5)) 6)'' evaluates to ''(1 (2 3 4 5) 6)'';
+  * write a procedure which checks if a list is **semantically valid**. What type of checks do you need to implement?
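As a starting point for the PDA strategy mentioned above, the following minimal sketch in plain Java (outside JFlex) uses an explicit stack to track open parentheses. It checks only nesting and the token shapes taken from the grammar; it is more permissive than the grammar about spaces, and it does not evaluate operations or perform semantic checks. The class name ''ListBalanceCheck'' and the method ''isWellNested'' are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ListBalanceCheck {
    // Token shapes from the grammar: integers, the operators "++" and ":", parentheses.
    static final Pattern TOKEN = Pattern.compile("[0-9]+|\\+\\+|:|\\(|\\)");

    // Returns true if the input is exactly one well-nested list.
    // Simplification: any number of spaces is accepted between tokens.
    static boolean isWellNested(String input) {
        Deque<Character> stack = new ArrayDeque<>(); // the PDA stack
        Matcher m = TOKEN.matcher(input);
        boolean closedTopLevel = false;
        int pos = 0;
        while (pos < input.length()) {
            if (input.charAt(pos) == ' ') {
                pos++;
                continue;
            }
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                return false; // symbol outside the alphabet
            }
            if (closedTopLevel) {
                return false; // tokens after the outermost list was closed
            }
            String tok = m.group();
            if (tok.equals("(")) {
                stack.push('(');
            } else if (tok.equals(")")) {
                if (stack.isEmpty()) {
                    return false; // closing parenthesis without an opening one
                }
                stack.pop();
                if (stack.isEmpty()) {
                    closedTopLevel = true;
                }
            } else if (stack.isEmpty()) {
                return false; // integer or operator outside any list
            }
            pos = m.end();
        }
        return closedTopLevel;
    }

    public static void main(String[] args) {
        System.out.println(isWellNested("(1 (++ (: 2 (3)) (4 5)) 6)"));
        System.out.println(isWellNested("(1 (2 3)"));
    }
}
```

Since the stack only ever holds one kind of symbol, a simple depth counter would work equally well; the explicit stack is kept here to mirror the PDA. In a JFlex lexer, the same bookkeeping could live in a field declared between ''%{'' and ''%}'' and be updated in the actions of the rules for parentheses.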