6. Dfa to Regex conversions

6.1. State elimination

Consider the following DFAs:

DFA1 DFA2

Convert the given DFAs to a Regex (using the state-elimination strategy). Hint: is it easier to apply conversion on another DFA?

On what DFA should the algorithm be applied?


It is recommended to apply the State elimination algorithm on the minimal DFA.

DFA2


Step 1:

If there are multiple final states, make them non-final, create a new final state and add ε-transitions to it.

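If a final state has any transitions going out of it, make it non-final, create a new final state and add an ε-transition from the old final state to the new one.

If the initial state has any transitions going into it, create a new initial state and add an ε-transition from the new initial state to the old one.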

Step 2:

Pick a state that is not initial and not final. For each way to reach its direct successors from its direct predecessors, add a new transition. Then remove the state.

  1. Pick state 1:
    1. in:
      • 0 –(ε)–> 1
      • 2 –(0)–> 1
    2. out:
      • 1 –(ε)–> 4
      • 1 –(1)–> 2
    3. loop:
      • 1 –(0)–> 1
    4. Add new transitions:
      • 2 –(00*)–> 4
      • 2 –(00*1)–> 2
      • 0 –(0*)–> 4
      • 0 –(0*1)–> 2
  2. Pick state 3:
    1. Because state 3 has no direct successor, it can simply be removed without adding new transitions.
  3. Pick state 2:
    1. in:
      • 0 –(0*1)–> 2
    2. out:
      • 2 –(ε U 00*)–> 4
    3. loop:
      • 2 –(00*1)–> 2
    4. Add new transitions:
      • 0 –(0*1(00*1)*(ε U 00*))–> 4
      • This transition can be written more simply: 0 –(0*1(0+1)*0*)–> 4, where 0+ denotes one or more 0s (that is, 00*), not a union

Step 3:

Repeat step 2 and stop when there is one final and one initial state left.
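Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `eliminate_states`, the edge-dictionary representation and the use of Python `re` syntax (with the empty string standing for ε) are assumptions made here. The example edges are the ones listed for DFA2 after step 1, with the dead state 3 left out.

```python
import re
from itertools import product

def eliminate_states(edges, init, final):
    """Eliminate every state except init and final.

    edges maps (source, target) to a regex string in Python `re` syntax;
    the empty string stands for ε. Assumes step 1 was already applied.
    """
    edges = dict(edges)
    inner = {s for pair in edges for s in pair} - {init, final}
    for q in sorted(inner):
        loop = edges.pop((q, q), None)
        star = f"({loop})*" if loop is not None else ""
        ins = {p: r for (p, d), r in edges.items() if d == q}
        outs = {s: r for (p, s), r in edges.items() if p == q}
        # Drop every edge that touches q, then reconnect its neighbours.
        edges = {k: r for k, r in edges.items() if q not in k}
        for (p, r_in), (s, r_out) in product(ins.items(), outs.items()):
            new = f"({r_in}){star}({r_out})"
            old = edges.get((p, s))
            edges[(p, s)] = new if old is None else f"{old}|{new}"
    return edges.get((init, final))  # None means the language is empty

# Edges of DFA2 after step 1 (0 = new initial state, 4 = new final state).
dfa2_edges = {(0, 1): "", (1, 1): "0", (1, 2): "1",
              (1, 4): "", (2, 1): "0", (2, 4): ""}
regex = eliminate_states(dfa2_edges, 0, 4)
print(regex)
print(bool(re.fullmatch(regex, "0100")), bool(re.fullmatch(regex, "0110")))
```

The resulting expression is not simplified (it carries many redundant parentheses), so it will look different from the hand-derived one even though it should describe the same language.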

DFA1


  1. first, we apply the minimisation algorithm
  2. we add the new final and initial states
  3. we eliminate the state 8
    1. as it has no outgoing transitions, we can just remove it
  4. we eliminate the state 0,1
    1. in:
      • init –(ε)–> 0,1
    2. out:
      • 0,1 –(ε)–> fin
      • 0,1 –(1)–> 2
    3. loop:
      • 0,1 –(0)–> 0,1
    4. we add:
      • init –(0*)–> fin
      • init –(0*1)–> 2
  5. we eliminate the state 2
    1. in:
      • init –(0*1)–> 2
      • 3 –(1)–> 2
    2. out:
      • 2 –(ε)–> fin
      • 2 –(0)–> 3
      • 2 –(1)–> 4
    3. we add:
      • init –(0*1)–> fin
      • init –(0*10)–> 3
      • init –(0*11)–> 4
      • 3 –(1)–> fin
      • 3 –(10)–> 3
      • 3 –(11)–> 4
  6. we eliminate the state 4
    1. in:
      • init –(0*11)–> 4
      • 3 –(11)–> 4
      • 6,7 –(1)–> 4
    2. out:
      • 4 –(0)–> 6,7
    3. we add:
      • init –(0*110)–> 6,7
      • 3 –(110)–> 6,7
      • 6,7 –(10)–> 6,7
  7. we eliminate the state 6,7
    1. in:
      • init –(0*110)–> 6,7
      • 3 –(110)–> 6,7
      • 5 –(1)–> 6,7
    2. out:
      • 6,7 –(0)–> 5
      • 6,7 –(ε)–> fin
    3. loop:
      • 6,7 –(10)–> 6,7
    4. we add:
      • init –(0*110(10)*0)–> 5
      • init –(0*110(10)*)–> fin
      • 3 –(110(10)*0)–> 5
      • 3 –(110(10)*)–> fin
      • 5 –(1(10)*0)–> 5
      • 5 –(1(10)*)–> fin
  8. we eliminate the state 3
    1. in:
      • init –(0*10)–> 3
    2. out:
      • 3 –(ε|1|110(10)*)–> fin
      • 3 –(0|110(10)*0)–> 5
    3. loop:
      • 3 –(10)–> 3
    4. we add:
      • init –(0*10(10)*(ε|1|110(10)*))–> fin
      • init –(0*10(10)*(0|110(10)*0))–> 5
  9. we eliminate the state 5
    1. in:
      • init –(0*110(10)*0|0*10(10)*(0|110(10)*0) )–> 5
    2. out:
      • 5 –(1(10)*)–> fin
    3. loop:
      • 5 –(1(10)*0)–> 5
    4. we add:
      • init –( (0*110(10)*0|0*10(10)*(0|110(10)*0) ) (1(10)*0)*1(10)*)–> fin
  10. the final regex is 0*|0*1|0*110(10)*|0*10(10)*(ε|1|110(10)*)|(0*110(10)*0|0*10(10)*(0|110(10)*0))(1(10)*0)*1(10)*

6.2. Brzozowski's algebraic method

Janusz Brzozowski worked out a very elegant (and more computationally efficient) way to convert Dfas to regexes. It relies on an observation called Arden's Lemma:

Arden's Lemma

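Proposition (Arden's Lemma). Let $ X, A$ and $ B$ be languages such that $ \epsilon \notin A$ and $ X = A\cdot X \cup B$ . Then $ X = A^*B$ . (Without the condition $ \epsilon \notin A$ , $ A^*B$ is still a solution, namely the smallest one, but it need not be the only one.)

  • Note that we can apply Arden's lemma in various settings. For instance, let $ e_A, e_B$ be regexes such that $ \epsilon \notin L(e_A)$ and $ X = L(e_A)\cdot X \cup L(e_B)$ . Then $ X = L(e_A^*e_B)$ .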
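
As a quick sanity check (not a full proof), substituting $ A^*B$ for $ X$ on the right-hand side of the equation gives back $ A^*B$ :
$ A\cdot(A^*B) \cup B = (A A^* \cup \{\epsilon\})\cdot B = A^*B$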

Dfa to regex conversion

For each state $ q$ , build an equation of the form $ q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n$ , where $ \delta(q,c_i) = q_i$ . Here $ c_i\in\Sigma$ , so each $ q_i$ is the $ c_i$ -successor of $ q$ . Additionally, if $ q$ is a final state, add an $ \epsilon$ term: $ q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n \cup \epsilon$ .

Example
Consider the Dfa from the left figure. The equations that we get are:
$ q_1 = A q_1 \cup B q_2$
$ q_2 = (A\cup B)q_2 \cup \epsilon$
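
The construction of the equations can be sketched in a few lines of Python. The helper name `dfa_equations` and the dictionary encoding of $ \delta$ are assumptions made for this illustration; the transition table is read off the two equations above, since the figure itself is not reproduced here.

```python
def dfa_equations(delta, finals):
    """Return one equation (as a string) per state.

    delta maps (state, symbol) to the successor state; finals is the set
    of final states. Terms are not grouped, so (A ∪ B) q2 appears as
    'A q2 ∪ B q2'.
    """
    terms = {}
    for (q, c), q_next in delta.items():
        terms.setdefault(q, []).append(f"{c} {q_next}")
    return {q: " ∪ ".join(ts + (["ε"] if q in finals else []))
            for q, ts in terms.items()}

# Hypothetical encoding of the example Dfa (q2 is the only final state).
delta = {("q1", "A"): "q1", ("q1", "B"): "q2",
         ("q2", "A"): "q2", ("q2", "B"): "q2"}
equations = dfa_equations(delta, finals={"q2"})
for q, rhs in equations.items():
    print(f"{q} = {rhs}")
```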

Equations

Equations are expressions containing regexes together with state variables (like $ q_1$ ), which are the unknowns. An equation of the form $ q = e\cdot q'$ states that $ L(A_q) = L(e) L(A_{q'})$ , where $ L(A_q)$ and $ L(A_{q'})$ are the languages accepted by our Dfa when started from states $ q$ and $ q'$ , respectively.

Reducing the system of equations

We can choose any equation except that corresponding to the initial state, and eliminate it, by exploiting Arden's Lemma:

  • the solution to any equation of the form $ q = e\cdot q \cup e'$ is $ q = e^*e'$ .

Example

Going back to the previous system of equations, we can find the solution to $ q_2$ which is: $ (A \cup B)^*$ . Next, we can replace the solution to $ q_2$ in $ q_1$ which yields:

  • $ q_1 = A q_1 \cup B(A\cup B)^*$ .
  • We apply Arden's Lemma one more time and obtain: $ q_1 = A^*B(A \cup B)^*$ .

Another example

The initial set of equations is:
$ q_1 = A q_1 \cup B q_2$
$ q_2 = A q_2 \cup B q_1 \cup \epsilon$
  • We reduce the system by removing the second equation:
  • $ q_2 = A^*(Bq_1\cup \epsilon)$
  • We replace $ q_2$ into the first equation:
  • $ q_1 = A q_1 \cup BA^*(B q_1 \cup \epsilon) = (A \cup BA^*B) q_1 \cup BA^*$
  • and we apply Arden's lemma again: $ q_1 = (A \cup BA^*B)^*BA^*$
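
The reduction can also be mechanised. The following is a rough sketch under stated assumptions, not a definitive implementation: the names `alt`, `cat`, `arden` and `solve` are invented here, regexes are plain strings in Python `re` syntax, the empty string stands for $ \epsilon$ , and `None` stands for a missing term. It eliminates every equation except the one of the initial state, exactly as in the example above.

```python
import re
from itertools import product

def alt(a, b):
    """Union of two regexes; None means 'no term'."""
    if a is None:
        return b
    if b is None:
        return a
    return f"{a}|{b}"

def cat(a, b):
    """Concatenation of two regexes; None absorbs."""
    if a is None or b is None:
        return None
    return f"({a})({b})"

def arden(coeffs, const, q):
    """Rewrite q = loop·q ∪ rest as q = loop*·rest (Arden's Lemma)."""
    coeffs = dict(coeffs)
    loop = coeffs.pop(q, None)
    if loop is None:
        return coeffs, const
    star = f"({loop})*"
    return {v: cat(star, r) for v, r in coeffs.items()}, cat(star, const)

def solve(equations, initial):
    """equations[q] = (coeffs, const), coeffs mapping variables to regexes."""
    eqs = {q: (dict(c), k) for q, (c, k) in equations.items()}
    for q in [s for s in eqs if s != initial]:
        coeffs_q, const_q = arden(*eqs.pop(q), q)
        # Substitute the solution for q into every remaining equation.
        for p, (coeffs_p, const_p) in eqs.items():
            c = coeffs_p.pop(q, None)
            if c is None:
                continue
            for v, r in coeffs_q.items():
                coeffs_p[v] = alt(coeffs_p.get(v), cat(c, r))
            eqs[p] = (coeffs_p, alt(const_p, cat(c, const_q)))
    _, const = arden(*eqs[initial], initial)
    return const

# q1 = A q1 ∪ B q2,  q2 = A q2 ∪ B q1 ∪ ε
system = {"q1": ({"q1": "A", "q2": "B"}, None),
          "q2": ({"q2": "A", "q1": "B"}, "")}
solution = solve(system, "q1")
print(solution)
```

The printed expression is heavily parenthesised, but it has the same shape as $ (A \cup BA^*B)^*BA^*$ . Since the equations describe words with an odd number of B's, a brute-force comparison against that property is an easy way to test the result.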

Exercise

6.2.1. Apply Brzozowski's method to find a regex for the following DFA:

Solution


q0 = ε | Aq1 | Bq2

q1 = ε | Aq1 | Bq2

q2 = Bq2 | Aq0

⇒ q2 = B*Aq0


Replace q2:

q0 = ε | Aq1 | BB*Aq0

q1 = ε | Aq1 | BB*Aq0

⇒ q1 = A*(ε | BB*Aq0)


Replace q1:

q0 = ε | AA*(ε | BB*Aq0) | BB*Aq0

⇒ q0 = ε | A+ | (A+B+A | B+A)q0

⇒ q0 = A* | (A+B+A | B+A)q0

⇒ q0 = (A+B+A | B+A)*A*


From q1 = A*(ε | BB*Aq0)

⇒ q1 = A*(ε | BB*A(A+B+A | B+A)*A*)

⇒ q1 = A* | A*BB*A(A+B+A | B+A)*A*


From q2 = B*Aq0

⇒ q2 = B*A(A+B+A | B+A)*A*
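
A solution like this one can be cross-checked by brute force. The sketch below assumes that the DFA is the one described by the three initial equations (the figure is not reproduced here) and transcribes the three results into Python `re` syntax; the names `accepts` and `mismatches` are chosen for this illustration only.

```python
import re
from itertools import product

# Transitions and final states read off the equations for q0, q1, q2.
delta = {("q0", "A"): "q1", ("q0", "B"): "q2",
         ("q1", "A"): "q1", ("q1", "B"): "q2",
         ("q2", "A"): "q0", ("q2", "B"): "q2"}
finals = {"q0", "q1"}

# The three solutions, transcribed into Python `re` syntax.
solutions = {
    "q0": r"(A+B+A|B+A)*A*",
    "q1": r"A*|A*BB*A(A+B+A|B+A)*A*",
    "q2": r"B*A(A+B+A|B+A)*A*",
}

def accepts(start, word):
    """Run the DFA on word, starting from the given state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

def mismatches(max_len=8):
    """List (state, word) pairs where the DFA and the regex disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("AB", repeat=n):
            word = "".join(letters)
            for q, rx in solutions.items():
                if accepts(q, word) != (re.fullmatch(rx, word) is not None):
                    bad.append((q, word))
    return bad

print(mismatches())
```

An empty list means no counterexample was found among the words up to the chosen length; it is evidence for the solution, not a proof of it.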