Lab
Lab-01 - Regular expressions
Key insights:
- there are more languages than regular expressions
- some languages therefore have no regular expression describing them (a direct consequence of the above)
- regular expressions are unambiguous and FINITE representations of languages
Objectives:
- Understand what a language and a regular expression are, and the relation between them;
- Write several regular expressions for designated languages
- Identify languages described by some regular expressions (e.g. ?!)
Resources (tentative):
Exercises
I. What is the regular expression for the following languages:
- $ \Sigma=\{0, 1\}$ , $ L=\{011\}$
Solution: $ E=011$
Obs: By definition the correct expression is $ E=((01)1)$ , but we omit parentheses when they are not needed, and we use a precedence rule to reduce the number of parentheses in regular expressions as much as possible (Kleene Star > Concatenation > Union).
- $ \Sigma=\{a, b\}$ , $ L=\{a, b\}$
Solution: $ E=a \cup b$
Obs: By definition the correct expression is $ E=(a \cup b)$ , but we can remove parentheses for the same reason as above.
- $ \Sigma=\{0, 1\}$ , $ L=\{\epsilon, 0, 1, 00, 01, 10, 11, 000, \ldots\}$
Solution: $ E=(0 \cup 1)^{*}$
- $ \Sigma=\{0, 1\}$ , $ L=\{010010101000010, 010010101000011\}$
Solution: $ E=01001010100001(0 \cup 1)$
- $ \Sigma=\{0, 1\}$ , $ L=\{w \in \Sigma^{*} \mid w \text{ ends with } 0\}$
Solution: $ E=(0 \cup 1)^{*}0$
- $ \Sigma=\{0, 1\}$ , $ L=\{w \in \Sigma^{*} \mid w=w_101 \lor w=1w_1, w_1 \in \Sigma^{*}\}$
Solution: $ E=(0 \cup 1)^{*}01 \cup 1(0 \cup 1)^{*}$
- $ \Sigma=\{x, y\}$ , $ L=\{w \in \Sigma^{*} \mid \#_x(w) = 2\}$
Solution: $ E=y^{*}xy^{*}xy^{*}$
Obs: $ \#_x(w)$ denotes the number of occurrences of $ x$ in $ w$ .
- $ \Sigma=\{a, b\}$ , $ L=\{w \in \Sigma^{*} \mid \#_a(w) \bmod 2 = 0\}$
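Solution: $ E=b^{*}(ab^{*}ab^{*})^{*}$
Obs: The leading $ b^{*}$ is needed so that words with no $ a$ at all, such as $ bb$ , are also accepted.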
- $ \Sigma=\{x, y\}$ , $ L=\{w \in \Sigma^{*} \mid \#_x(w) \ge 1\}$
Solution: $ E=(x \cup y)^{*}x(x \cup y)^{*}$
- $ \Sigma=\{a, b, c\}$ , $ L=\{w \in \Sigma^{*} \mid \#_a(w) \ge 1 \land \#_c(w) \ge 1\}$
Solution: $ E=((a \cup b \cup c)^{*}a(a \cup b \cup c)^{*}c(a \cup b \cup c)^{*}) \cup ((a \cup b \cup c)^{*}c(a \cup b \cup c)^{*}a(a \cup b \cup c)^{*})$
- $ \Sigma=\{a, b\}$ , $ L=\{w \in \Sigma^{*} \mid w \text{ does not contain } ba\}$
Solution: $ E=a^{*}b^{*}$
- $ \Sigma=\{a, b\}$ , $ L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) = 0\}$
Solution: $ E=\epsilon$
- $ \Sigma=\{a, b\}$ , $ L=\{w \in \Sigma^{*} \mid \#_a(w) + \#_b(w) < 0\}$
Solution: $ E=\emptyset$
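A quick way to sanity-check some of the solutions above is to translate them into `java.util.regex` syntax and test a few words. The sketch below is only an illustration: the class name `RegexExercises`, the constant names and the helper `inLanguage` are made up for this example, and union ($ \cup$) is written as `|`. `Pattern.matches` succeeds only when the whole word matches, which corresponds to membership in the language.

```java
import java.util.regex.Pattern;

public class RegexExercises {
    // Lab notation translated to java.util.regex: union becomes "|", star stays "*".
    static final String ENDS_WITH_ZERO = "(0|1)*0";       // w ends with 0
    static final String EXACTLY_TWO_X  = "y*xy*xy*";      // #_x(w) = 2
    static final String AT_LEAST_ONE_X = "(x|y)*x(x|y)*"; // #_x(w) >= 1
    static final String NO_BA          = "a*b*";          // w does not contain ba

    // True when the WHOLE word belongs to the language of the expression.
    static boolean inLanguage(String regex, String word) {
        return Pattern.matches(regex, word);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage(ENDS_WITH_ZERO, "0110"));  // true
        System.out.println(inLanguage(ENDS_WITH_ZERO, "011"));   // false
        System.out.println(inLanguage(EXACTLY_TWO_X, "yxyyx"));  // true
        System.out.println(inLanguage(EXACTLY_TWO_X, "xxx"));    // false
        System.out.println(inLanguage(AT_LEAST_ONE_X, "yyy"));   // false
        System.out.println(inLanguage(NO_BA, "aabbb"));          // true
        System.out.println(inLanguage(NO_BA, "aba"));            // false
    }
}
```

Testing a handful of words does not prove an expression correct, but a single word that is wrongly accepted or rejected is enough to show that it is wrong.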
Lab x - JFlex
Installing JFlex
A complete, platform-dependent set of installation instructions can be found here. In a nutshell, JFlex comes as a binary app jflex.
The structure of a flex file
Consider the following simple JFlex file:
import java.util.*;

%%

%class HelloLexer
%standalone

%{
    public Integer words = 0;
%}

LineTerminator = \r|\n|\r\n

%%

[a-zA-Z]+ { words+=1; }
{LineTerminator} { /* do nothing*/ }
Suppose the above file is called Hello.flex. Running the command jflex Hello.flex will generate a Java class which implements a lexer.
Each JFlex file (such as the above) contains 5 sections:
- the first section, which ends at the first occurrence of %%, contains declarations which will be added at the beginning of the Java class file.
- the second section, right after %% and until %{, contains a sequence of options for jflex. Here, we use two options:
  - class HelloLexer tells jflex that the name of the generated lexer class should be HelloLexer
  - standalone tells jflex to print the unmatched input word to standard output and continue scanning.
  - More details regarding possible options can be found in the JFlex docs.
- the third section, delimited by %{ and %}, contains declarations which will be appended in the Lexer class file. Here we declare a public variable words.
- the fourth section contains regular expression declarations. Here, we have declared LineTerminator to be the regular expression \r | \n | \r\n. Declarations can be used to build more complicated RegExps from simple ones, and can be used as well in the fifth section of the flex file.
- the fifth section contains rules and actions: a rule specifies a regular expression to be scanned, as well as the appropriate action to be taken when a word satisfying the regexp is found:
  - the rule [a-zA-Z]+ { words+=1; } states that whenever [a-zA-Z]+ (a regexp defined inline) is matched by a word, words+=1; should be executed;
  - the rule {LineTerminator} { } refers to the regexp defined above (note the curly braces); here no action should be executed;
  - JFlex will always scan for the longest input word which satisfies a regexp. When a word satisfies more than one regexp, the first one from the flex file will be matched.
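The longest-match and first-rule-wins policy can be illustrated with a small hand-written scanner. This is only a sketch of the selection policy and not how JFlex is implemented (JFlex generates an automaton). The class name `LongestMatchSketch`, the rule names `IF` and `IDENT` and the two rules themselves are hypothetical and do not come from Hello.flex. The sketch also assumes that each pattern's greedy match from `java.util.regex` is its longest match, which holds for the two patterns used here but not for every expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchSketch {
    // Hypothetical rules, in "file order": the keyword comes before the identifier.
    static final String[] NAMES = { "IF", "IDENT" };
    static final Pattern[] RULES = { Pattern.compile("if"), Pattern.compile("[a-z]+") };

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int bestRule = -1;
            int bestLen = 0;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input).region(pos, input.length());
                // Strict ">" keeps the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestRule = i;
                    bestLen = m.end() - pos;
                }
            }
            if (bestRule < 0) {
                pos++; // no rule matches here: skip one character and keep scanning
            } else {
                tokens.add(NAMES[bestRule] + ":" + input.substring(pos, pos + bestLen));
                pos += bestLen;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IF:if, IDENT:iffy, IDENT:fi]
        System.out.println(scan("if iffy fi"));
    }
}
```

On the input `if iffy fi`, the word `if` satisfies both rules with the same length, so the first rule is chosen, while `iffy` is matched by the identifier rule because that match is longer.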
Compiling a Hello World project
After performing:
jflex Hello.flex
we obtain HelloLexer.java which contains the HelloLexer public class implementing our lexer. We can easily include this class in our project, e.g.: