3. Regular Expressions

For each of the exercises from DFA Seminar 1, write a regex describing the same language.

3.1.1. $ L=\{w \in \{0,1\}^* \mid w \text{ contains an odd number of ones} \} $

3.1.1.

$ 0^*10^*(10^*10^*)^*$

3.1.2. The language of binary words which contain exactly two ones

3.1.2.

$ 0^*10^*10^*$

3.1.3. The language of binary words which encode odd numbers (the last digit is the least significant)

3.1.3.

$ (0 \cup 1)^*1$

3.1.4. The set of all binary strings having the substring 00101

3.1.4.

$ (0 \cup 1)^*00101(0 \cup 1)^*$
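
The four regexes above can be sanity-checked by brute force. The sketch below is only an illustration: it assumes the union operator $ \cup $ is written as `|` in Python's `re` syntax, and the names `REGEXES`, `PREDICATES`, `all_words` and `mismatches` are made up for this example. It compares each regex against a direct membership test on all binary words up to a fixed length, which gives evidence of correctness but is not a proof.

```python
import re
from itertools import product

# Regexes from 3.1.1 - 3.1.4, with the union operator written as | for Python's re module.
REGEXES = {
    "odd number of ones": r"0*10*(10*10*)*",
    "exactly two ones": r"0*10*10*",
    "odd number": r"(0|1)*1",
    "contains 00101": r"(0|1)*00101(0|1)*",
}

# Direct membership tests, taken from the exercise statements.
PREDICATES = {
    "odd number of ones": lambda w: w.count("1") % 2 == 1,
    "exactly two ones": lambda w: w.count("1") == 2,
    "odd number": lambda w: w.endswith("1"),
    "contains 00101": lambda w: "00101" in w,
}

def all_words(max_len):
    """Yield every binary word of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)

def mismatches(name, max_len=10):
    """Words on which the regex and the direct test disagree (ideally none)."""
    pattern = re.compile(REGEXES[name])
    return [w for w in all_words(max_len)
            if (pattern.fullmatch(w) is not None) != PREDICATES[name](w)]

for name in REGEXES:
    print(name, mismatches(name))
```

`fullmatch` is used rather than `search` because a regex describes a language only when it has to match the whole word.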

3.2.1.

$ A=\{ 0^{2k} \mid k \geq 1 \}$

$ B = \{0, \epsilon \}$
$ AB = ? $

A = {00, 0000, 000000, …}

AB = {00, 000, 0000, 00000, …} ← this is the concatenation of the languages A and B: one word uv for each pair (u, v) of the cartesian product A × B, with the word from A first.

The words of even length are obtained by concatenating a word from A with the word ε from B,
and those of odd length are obtained by concatenating a word from A with the word 0 from B. So $ AB = \{ 0^n \mid n \geq 2 \}$.
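
As a small illustration, concatenation can be computed directly on a finite sample of A. This is a sketch, not part of the exercise: the helper `concat` and the bound k ≤ 4 are assumptions chosen for the example, and the empty Python string stands for ε.

```python
def concat(A, B):
    """Concatenation of two finite languages: every u from A followed by every v from B."""
    return {u + v for u in A for v in B}

# Finite sample of A = { 0^(2k) | k >= 1 }, truncated at k = 4 (arbitrary bound).
A = {"0" * (2 * k) for k in range(1, 5)}
B = {"0", ""}  # "" plays the role of epsilon

AB = concat(A, B)
print(sorted(AB, key=len))
```

With this sample, AB contains one word of each length from 2 to 9, even and odd lengths alike.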

3.2.2.

$ A = \{ 0^n 1^n \mid n \geq 1 \}$
$ B = \{ 1^n \mid n \geq 1 \} $
$ AB = ? $
$ BA = ? $

A is the language of words that start with zeros and end with ones, where the number of ones is equal to the number of zeros (the same value of n is used for both).

The n in the definition of B is completely unrelated to the n used to define A.

So B is the language of nonempty words made only of ones, i.e. B = L(11*).

A = {01, 0011, 000111, 00001111 …}

B = {1, 11, 111 …}

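AB = {011, 00111, 0001111, …, 0111, 001111, 00011111, 0000111111, … }, that is $ AB = \{ 0^n 1^m \mid m > n \geq 1 \}$

BA = {101, 10011, 1000111, …, 1101, 110011, …, 11101, 1110011 …}, that is $ BA = \{ 1^k 0^n 1^n \mid k \geq 1, n \geq 1 \}$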
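
The two concatenations can be enumerated on finite samples of A and B. The following is a minimal sketch under assumptions: `concat` is a helper written for this example, and `N = 4` is an arbitrary bound on the exponents.

```python
def concat(A, B):
    """Concatenation of two finite languages: every u from A followed by every v from B."""
    return {u + v for u in A for v in B}

N = 4  # arbitrary bound on the exponents in the samples

A = {"0" * n + "1" * n for n in range(1, N + 1)}  # 01, 0011, ...
B = {"1" * n for n in range(1, N + 1)}            # 1, 11, ...

AB = concat(A, B)  # zeros, then strictly more ones
BA = concat(B, A)  # some ones, then a word of A

print(sorted(AB, key=len))
print(sorted(BA, key=len))
```

The order of the operands matters: no word appears in both `AB` and `BA`, since words of AB start with 0 and words of BA start with 1.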

3.2.3.

$ A = \{ 0^n 1^n 0^m \mid m \geq n \geq 1 \}$
$ B = \{ 0^n \mid n \geq 1 \} $
$ AB = ? $
$ BA = ? $

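$ AB = \{ 0^n 1^n 0^{m+k} \mid m \geq n \geq 1, k \geq 1 \} = \{ 0^n 1^n 0^p \mid p > n \geq 1 \}$ . So $ AB \subsetneq A$ : every word of AB is in A, but the words $ 0^n 1^n 0^n$ of A (for example 010) are not in AB.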

Note that the n in the definition of language A is different from the n in the definition of B; they are independent because they are used to define different sets/languages. However, when n is used several times in the definition of one language, such as the two times it appears in language A, it denotes the same value.

$ BA = \{ 0^{n+k} 1^n 0^m \mid m \geq n \geq 1, k \geq 1 \}$ . Equivalently: $ BA = \{0^x 1^y 0^z \mid x > y \geq 1 \text{ and } z \geq y \}$
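
The shape of the words in AB and BA can be inspected on finite samples. This is a sketch under assumptions: the helpers `concat` and `shape` and the bounds `N` and `M` are introduced only for this example. It checks how AB relates to A and which constraints the exponents of the words of BA satisfy.

```python
import re

def concat(A, B):
    """Concatenation of two finite languages: every u from A followed by every v from B."""
    return {u + v for u in A for v in B}

N = 3  # largest n and k in the samples (arbitrary bound)
M = 6  # largest m in the sample of A (arbitrary bound)

A = {"0" * n + "1" * n + "0" * m for n in range(1, N + 1) for m in range(n, M + 1)}
B = {"0" * k for k in range(1, N + 1)}

AB = concat(A, B)
BA = concat(B, A)

def shape(w):
    """Return (x, y, z) if w = 0^x 1^y 0^z with x, y, z >= 1, else None."""
    m = re.fullmatch(r"(0+)(1+)(0+)", w)
    return tuple(len(g) for g in m.groups()) if m else None

# 010 belongs to A (n = m = 1) but cannot be split into a word of A followed by a word of B.
print("010" in A, "010" in AB)  # True False

# In AB the trailing block of zeros is strictly longer than the block of ones.
print(all(x == y and z > y for x, y, z in map(shape, AB)))

# In BA the leading block of zeros is strictly longer than the block of ones.
print(all(x > y and z >= y for x, y, z in map(shape, BA)))
```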

3.2.4.

$ A = \emptyset $
$ B = \{ 1^n \mid n \geq 1 \} $
$ AB = ? $
$ A^* = ? $
$ B^* = ? $

AB = ∅ (A is empty, so there is no word from A to concatenate with a word from B)

A* = {ε} (ε always belongs to the Kleene star, as the concatenation of zero words, and A contributes no other word)

B* = {ε} ∪ B ∪ BB ∪ BBB ∪ … = {ε} ∪ {$ 1^n \mid n \geq 1 $} ∪ {$ 1^n \mid n \geq 2 $} ∪ {$ 1^n \mid n \geq 3 $} ∪ … = {$ 1^n \mid n \geq 0 $}. So B* = L(1*).
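
The Kleene star can be approximated on finite languages by repeated concatenation, keeping only the words up to a length bound. The sketch below is an illustration under assumptions: `concat` and `star` are helpers written for this example, B is truncated to words of length at most 3, and the empty Python string stands for ε.

```python
def concat(A, B):
    """Concatenation of two finite languages: every u from A followed by every v from B."""
    return {u + v for u in A for v in B}

def star(L, max_len):
    """Words of L* of length at most max_len, for a finite language L."""
    result = {""}    # epsilon: the concatenation of zero words
    frontier = {""}  # words found in the previous round
    while frontier:
        # Append one more word of L, keep only new words within the length bound.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result

A = set()                               # the empty language
B = {"1" * n for n in range(1, 4)}      # finite sample of { 1^n | n >= 1 }

print(concat(A, B))                     # set()
print(star(A, 5))                       # {''}
print(sorted(star(B, 6), key=len))
```

The loop stops because each round only keeps words not seen before, and there are finitely many words within the length bound.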

3.1. Natural Language / DFA $ \rightarrow $ Regex conversion

3.2. Formation rules (concatenation, union, Kleene star)