Differences
This shows you the differences between two versions of the page.
lfa:lab [2018/09/27 20:53] lfa [Lab-01 - Expresii regulate] |
lfa:lab [2018/10/02 11:20] (current) pdmatei |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Lab ====== | ====== Lab ====== | ||
+ | |||
+ | |||
===== Lab-01 - Expresii regulate ===== | ===== Lab-01 - Expresii regulate ===== | ||
Line 51: | Line 53: | ||
* $math[\Sigma=\{a, b\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) < 0\}] \\ | * $math[\Sigma=\{a, b\}], $math[L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) < 0\}] \\ | ||
//Solution:// $math[E=\emptyset] \\ | //Solution:// $math[E=\emptyset] \\ | ||
+ | |||
+ | ===== Lab x - JFlex ===== | ||
+ | |||
+ | ==== Installing JFlex ==== | ||
+ | |||
+ | A complete, platform-dependent set of installation instructions can be found [[http://jflex.de/installing.html| here]]. In a nutshell, JFlex comes as a binary app ''jflex''. | ||
+ | |||
+ | ==== The structure of a flex file ==== | ||
+ | |||
+ | Consider the following simple JFlex file: | ||
+ | <code java> | ||
+ | import java.util.*; | ||
+ | |||
+ | %% | ||
+ | |||
+ | %class HelloLexer | ||
+ | %standalone | ||
+ | |||
+ | %{ | ||
+ | public Integer words = 0; | ||
+ | %} | ||
+ | |||
+ | LineTerminator = \r|\n|\r\n | ||
+ | |||
+ | %% | ||
+ | |||
+ | [a-zA-Z]+ { words+=1; } | ||
+ | {LineTerminator} { /* do nothing*/ } | ||
+ | </code> | ||
+ | |||
+ | Suppose the above file is called ''Hello.flex''. Running the command ''jflex Hello.flex'' will generate a Java class which implements a lexer. | ||
+ | |||
+ | Each JFlex file (such as the one above) can be read as 5 sections. JFlex itself only separates three parts with ''%%''; here the middle part is subdivided further: | ||
+ | * the first section, which ends at the first occurrence of ''%%'', contains declarations which will be added at the beginning of the generated Java file. | ||
+ | * the second section, right after ''%%'' and until ''%{'', contains a sequence of options for JFlex. Here, we use two options: | ||
+ | * ''%class HelloLexer'' tells JFlex that the generated lexer class should be named ''HelloLexer''. | ||
+ | * ''%standalone'' tells JFlex to print any unmatched input to standard output and continue scanning. | ||
+ | * More details regarding possible options can be found in the [[http://jflex.de/manual.pdf|JFlex docs]]. | ||
+ | * the third section, delimited by ''%{'' and ''%}'', contains declarations which will be copied into the generated lexer class. Here we declare a public variable ''words''. | ||
+ | * the fourth section contains regular expression **declarations**. Here, we have declared ''LineTerminator'' to be the regular expression ''\r | \n | \r\n''. Declarations can be used to build more complicated regexps from simple ones, and can also be referred to in the fifth section of the flex file. | ||
+ | * the fifth section contains rules and actions: a rule specifies a regular expression to be scanned, as well as the appropriate action to be taken, when a word satisfying the regexp is found: | ||
+ | * the rule ''[a-zA-Z]+ { words+=1; }'' states that whenever ''[a-zA-Z]+'' (a regexp defined inline) is matched by a word, ''words+=1;'' should be executed; | ||
+ | * the rule ''{LineTerminator} { /* do nothing*/ }'' refers to the regexp declared above (note the curly braces around its name); here no action is executed; | ||
+ | * JFlex will always scan for the **longest** input word which satisfies a regexp. When a word satisfies more than one regexp the **first** one from the flex file will be matched. | ||
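The longest-match and first-rule policies described above can be illustrated outside JFlex. The following is only a rough sketch built on ''java.util.regex'', not the code JFlex generates; the class name ''LongestMatchDemo'', the rule names and the extra ''if'' keyword rule are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: mimics "longest match wins, ties go to the first rule".
class LongestMatchDemo {
    // Hypothetical rules, listed in the order they would appear in a flex file.
    static final String[] NAMES = { "KEYWORD", "IDENT", "NEWLINE" };
    static final Pattern[] RULES = {
        Pattern.compile("if"),
        Pattern.compile("[a-zA-Z]+"),
        // java.util.regex tries alternatives left to right, so the longer
        // alternative \r\n is listed first to obtain the longest match.
        Pattern.compile("\\r\\n|\\r|\\n")
    };

    // Returns the name of the rule selected at position pos, or null if none matches.
    static String choose(String input, int pos) {
        String best = null;
        int bestLen = 0;
        for (int i = 0; i < RULES.length; i++) {
            Matcher m = RULES[i].matcher(input);
            m.region(pos, input.length());
            if (m.lookingAt()) {
                int len = m.end() - pos;
                // Strict '>' keeps the earlier rule when two matches have equal length.
                if (len > bestLen) {
                    best = NAMES[i];
                    bestLen = len;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(choose("if", 0));   // both rules match 2 characters, first rule wins
        System.out.println(choose("iffy", 0)); // the identifier rule matches 4 characters
    }
}
```

For the input ''if'' both the keyword rule and the identifier rule match two characters, so the earlier rule is selected; for ''iffy'' the identifier rule matches more characters and is selected instead.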
+ | |||
+ | ==== Compiling a Hello World project ==== | ||
+ | |||
+ | After performing: | ||
+ | <code> | ||
+ | jflex Hello.flex | ||
+ | </code> | ||
+ | |||
+ | we obtain ''HelloLexer.java'' which contains the ''HelloLexer'' public class implementing our lexer. We can easily include this class in our project, e.g.: | ||
+ | |||
+ | <code java> | ||
+ | import java.io.*; | ||
+ | import java.util.*; | ||
+ | |||
+ | public class Hello { | ||
+ | public static void main (String[] args) throws IOException { | ||
+ | HelloLexer l = new HelloLexer(new FileReader(args[0])); | ||
+ | |||
+ | l.yylex(); | ||
+ | |||
+ | System.out.println(l.words); | ||
+ | |||
+ | | ||
+ | } | ||
+ | } | ||
+ | </code> | ||
+ | * Note that the lexer constructor receives a Java ''Reader'' as input (other options are possible, see the docs), and we take the name of the file to be scanned from the first command-line argument (''args[0]''). | ||
+ | * Each lexer implements the method ''yylex'' which starts the scanning process. | ||
+ | |||
+ | After compiling: | ||
+ | <code> | ||
+ | javac HelloLexer.java Hello.java | ||
+ | </code> | ||
+ | |||
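+ | and running it on a text file, whose name is passed as the first command-line argument (''input.txt'' below is only a placeholder): | ||
+ | |||
+ | <code> | ||
+ | java Hello input.txt | ||
+ | </code> | ||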
+ | |||
+ | we obtain: | ||
+ | <code> | ||
+ | |||
+ | |||
+ | |||
+ | 6 | ||
+ | </code> | ||
+ | at standard output. | ||
+ | |||
+ | Recall that the option ''standalone'' tells the lexer to print unmatched input. In our example, the unmatched input consists of whitespace characters. | ||
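To sanity-check the number printed by the driver, the same count can be computed without JFlex. This is a minimal sketch, assuming the only counting rule is ''[a-zA-Z]+'' as in ''Hello.flex''; the class name ''WordCountCheck'' and the sample text are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Cross-check for the lexer's word counter, using plain java.util.regex.
class WordCountCheck {
    // Counts maximal runs of ASCII letters, mirroring the rule [a-zA-Z]+ { words+=1; }.
    static int countWords(String text) {
        Matcher m = Pattern.compile("[a-zA-Z]+").matcher(text);
        int words = 0;
        while (m.find()) {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not the input used in the lab.
        System.out.println(countWords("one two three\nfour five six\n"));
    }
}
```

Digits and punctuation split words here just as they would in the lexer, since neither belongs to ''[a-zA-Z]''.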
+ | |||
+ | ==== Application - parsing lists ==== | ||
+ | |||
+ | Consider the following BNF grammar which describes lists: | ||
+ | <code> | ||
+ | <integer> ::= [0-9]+ | ||
+ | <op> ::= "++" | ":" | ||
+ | <element> ::= <integer> | <op> | <list> | ||
+ | <sequence> ::= <element> | <element> " " <sequence> | ||
+ | <list> ::= "()" | "(" <sequence> ")" | ||
+ | </code> | ||
+ | |||
+ | The following are examples of lists: | ||
+ | <code> | ||
+ | (1 2 3) | ||
+ | (1 (2 3) 4 ()) | ||
+ | (1 (++ (: 2 (3)) (4 5)) 6) | ||
+ | </code> | ||
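The third example uses the operators '':'' and ''++''. Assuming they denote "prepend an element" and "concatenate two lists" (which is what the evaluation example in the task list suggests), their effect on already-parsed lists can be sketched as follows. The class ''ListOps'' and its method names are hypothetical, and parsing is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the two list operators, under the assumption stated above.
class ListOps {
    // Assumed meaning of ":" : put one element in front of a list.
    static <T> List<T> cons(T head, List<T> tail) {
        List<T> out = new ArrayList<>();
        out.add(head);
        out.addAll(tail);
        return out;
    }

    // Assumed meaning of "++" : concatenate two lists.
    static <T> List<T> concat(List<T> left, List<T> right) {
        List<T> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    public static void main(String[] args) {
        // Corresponds to (++ (: 2 (3)) (4 5))
        List<Integer> inner = concat(cons(2, Arrays.asList(3)), Arrays.asList(4, 5));
        System.out.println(inner);
    }
}
```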
+ | |||
+ | Your task is to: | ||
+ | * correctly parse such lists: | ||
+ | * write a JFlex file to implement the lexer: | ||
+ | * Since the language describing lists is context-free (and not regular), parsing a list requires keeping track of the opened and closed parentheses. | ||
+ | * Start by writing a PDA (on paper) which accepts correctly-formed lists. Treat each regular expression you defined (for numbers and operators) as a single symbol; | ||
+ | * Implement the PDA (strategy) in the lexer file; | ||
+ | * given a correctly-formed list, write a procedure which evaluates the list operations (in the standard way). For instance, ''(1 (++ (: 2 (3)) (4 5)) 6)'' evaluates to ''(1 (2 3 4 5) 6)'' | ||
+ | * write a procedure which checks if a list is **semantically valid**. What type of checks do you need to implement? | ||
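The stack discipline behind the PDA can be sketched in plain Java before moving it into the lexer actions. The sketch below only checks that parentheses are balanced; it does not validate the rest of the grammar, and the class name ''ParenTracker'' is an assumption for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Partial illustration of the PDA idea: a stack that tracks open parentheses.
class ParenTracker {
    // Returns true when every ")" closes an earlier "(" and nothing is left open.
    static boolean balanced(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : input.toCharArray()) {
            if (c == '(') {
                stack.push(c);          // push on every opening parenthesis
            } else if (c == ')') {
                if (stack.isEmpty()) {
                    return false;       // closing parenthesis with nothing open
                }
                stack.pop();            // pop on every closing parenthesis
            }
            // all other characters (digits, operators, spaces) are ignored here
        }
        return stack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(balanced("(1 (2 3) 4 ())"));
        System.out.println(balanced("(1 (2 3)"));
    }
}
```

Because only one kind of bracket exists, an integer depth counter would serve equally well; the explicit stack is kept to mirror the PDA on paper.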
+ |