====== 6. Dfa to Regex conversions ======
==== 6.1. State elimination ====
Consider the following DFAs:
^ **DFA1** ^ **DFA2** ^
|{{ :lfa:screenshot_2021-11-04_at_15.33.10.png?400 |}}| {{ :lfa:2022:lfa2022_lab5_ex2_4_cerinta.png?300 |}} |
Convert the given DFAs to a Regex (using the state-elimination strategy).
Hint: is it easier to apply conversion on another DFA?
It is recommended to apply the State elimination algorithm on the minimal DFA.
{{ :lfa:2022:lfa2022_lab6_mindfa2_1.png?300 |}}
**Step 1:**
If there are multiple final states, make them non-final, create a new final state and add an ε-transition from each old final state to it.
If a final state has any transitions going out of it, make it non-final, create a new final state and add an ε-transition from the old final state to the new one.
If the initial state has any transitions going into it, create a new initial state and add an ε-transition from it to the old initial state.
{{ :lfa:2022:lfa2022_lab6_mindfa2_2.png?300 |}}
** Step 2:**
Pick a state that is not initial and not final. For each way to reach its direct successors from its direct predecessors, add a new transition. Then remove the state.
- Pick **state 1**:
- in:
* 0 --(ε)--> 1
* 2 --(0)--> 1
- out:
* 1 --(ε)--> 4
* 1 --(1)--> 2
- loop:
* 1 --(0)--> 1
- Add new transitions:
* 2 --(00*)--> 4
* 2 --(00*1)--> 2
* 0 --(0*)--> 4
* 0 --(0*1)--> 2
* {{ :lfa:2022:lfa2022_lab6_mindfa2_3.png?300 |}}
- Pick **state 3**:
- Because state 3 has no direct successor, it can simply be removed without adding new transitions.
- Pick **state 2**:
- in:
* 0 --(0*1)--> 2
- out:
* 2 --(ε U 00*)--> 4
- loop:
* 2 --(00*1)--> 2
- Add new transitions:
* 0 --(0*1(00*1)*(ε U 00*))--> 4
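* Since ε U 00* = 0* and 00* = 0+, this transition can be written more simply: 0 --( (0*1)(0+1)*0*)--> 4 (here + means "one or more", not union)
* Together with the transition 0 --(0*)--> 4 added when eliminating state 1, the resulting regex is 0* U (0*1)(0+1)*0*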
* {{ :lfa:2022:lfa2022_lab6_mindfa2_4.png?300 |}}
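A quick sanity check of the simplification above, as a minimal sketch: it compares the regexes on every word up to a fixed length using Python's ''re'' module. The helper names ''words'' and ''agree_up_to'' are illustrative, and a bounded check is only evidence, not a proof of equivalence.

```python
import re
from itertools import product

# Label obtained by eliminating state 2; (ε U 00*) is written (|00*) in Python.
original = re.compile(r"0*1(00*1)*(|00*)")
# Same label with ε U 00* replaced by 0*.
simplified = re.compile(r"0*1(00*1)*0*")
# The shorter form, reading + as "one or more" (which is also Python's meaning).
as_written = re.compile(r"(0*1)(0+1)*0*")

def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agree_up_to(r1, r2, max_len=10):
    """True if both regexes accept exactly the same words up to max_len."""
    return all(
        bool(r1.fullmatch(w)) == bool(r2.fullmatch(w))
        for w in words("01", max_len)
    )

print(agree_up_to(original, simplified))
print(agree_up_to(original, as_written))
```

Reading ''+'' as union instead would give a different language: ''0*1(0|1)*0*'' accepts ''11'', which the original label does not.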
** Step 3:**
Repeat step 2 and stop when there is one final and one initial state left.
For **DFA1**, we follow the same steps:
- first, we apply the minimisation algorithm
* {{ :lfa:2022:lab6_1_dfa1_1.png?500 |}}
- we add the new final and initial states
* {{ :lfa:2022:lab6_1_dfa1_2.png?500 |}}
- we eliminate the state 8
- as it has no outgoing transitions, we can just remove it
* {{ :lfa:2022:lab6_1_dfa1_3.png?500 |}}
- we eliminate the state 0,1
- in:
* init --(ε)--> 0,1
- out:
* 0,1 --(ε)--> fin
* 0,1 --(1)--> 2
- loop:
* 0,1 --(0)--> 0,1
- we add:
* init --(0*)--> fin
* init --(0*1)--> 2
* {{ :lfa:2022:lab6_1_dfa1_4.png?500 |}}
- we eliminate the state 2
- in:
* init --(0*1)--> 2
* 3 --(1)--> 2
- out:
* 2 --(ε)--> fin
* 2 --(0)--> 3
* 2 --(1)--> 4
- we add:
* init --(0*1)--> fin
* init --(0*10)--> 3
* init --(0*11)--> 4
* 3 --(1)--> fin
* 3 --(10)--> 3
* 3 --(11)--> 4
* {{ :lfa:2022:lab6_1_dfa1_5.png?500 |}}
- we eliminate the state 4
- in:
* init --(0*11)--> 4
* 3 --(11)--> 4
* 6,7 --(1)--> 4
- out:
* 4 --(0)--> 6,7
- we add:
* init --(0*110)--> 6,7
* 3 --(110)--> 6,7
* 6,7 --(10)--> 6,7
* {{ :lfa:2022:lab6_1_dfa1_6.png?500 |}}
- we eliminate the state 6,7
- in:
* init --(0*110)--> 6,7
* 3 --(110)--> 6,7
* 5 --(1)--> 6,7
- out:
* 6,7 --(0)--> 5
* 6,7 --(ε)--> fin
- loop:
* 6,7 --(10)--> 6,7
- we add:
* init --(0*110(10)*0)--> 5
* init --(0*110(10)*)--> fin
* 3 --(110(10)*0)--> 5
* 3 --(110(10)*)--> fin
* 5 --(1(10)*0)--> 5
* 5 --(1(10)*)--> fin
* {{ :lfa:2022:lab6_1_dfa1_7.png?500 |}}
- we eliminate the state 3
- in:
* init --(0*10)--> 3
- out:
* 3 --(ε|1|110(10)*)--> fin
* 3 --(0|110(10)*0)--> 5
- loop:
* 3 --(10)--> 3
- we add:
* init --(0*10(10)*(ε|1|110(10)*) )--> fin
* init --(0*10(10)*(0|110(10)*0) )-->5
* {{ :lfa:2022:lab6_1_dfa1_8.png?500 |}}
- we eliminate the state 5
- in:
* init --(0*110(10)*0|0*10(10)*(0|110(10)*0) )--> 5
- out:
* 5 --(1(10)*)--> fin
- loop:
* 5 --(1(10)*0)--> 5
- we add:
* init --( (0*110(10)*0|0*10(10)*(0|110(10)*0) ) (1(10)*0)*1(10)*)--> fin
- the final regex is ''0*|0*1|0*110(10)*|0*10(10)*(ε|1|110(10)*)|(0*110(10)*0|0*10(10)*(0|110(10)*0))(1(10)*0)*1(10)*''
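The elimination step used throughout this section can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the automaton is a dictionary from (source, target) pairs to regex labels, the empty string stands for ε, and the names ''union'', ''wrap'' and ''eliminate'' are made up for this sketch. The example encodes DFA2 after Step 1 as described in the walkthrough, with the sink state 3 left out because removing it adds no transitions.

```python
import re
from itertools import product

def union(a, b):
    """Union of two labels; None means 'no transition yet'."""
    if a is None:
        return b
    if b is None:
        return a
    return f"{a}|{b}"

def wrap(r):
    """Parenthesise a label unless it is ε or a single symbol."""
    return r if len(r) <= 1 else f"({r})"

def eliminate(edges, state):
    """Remove `state`, adding p --in loop* out--> q for every
    direct predecessor p and direct successor q."""
    loop = edges.get((state, state))
    star = f"{wrap(loop)}*" if loop else ""
    ins = {p: r for (p, q), r in edges.items() if q == state and p != state}
    outs = {q: r for (p, q), r in edges.items() if p == state and q != state}
    rest = {k: r for k, r in edges.items() if state not in k}
    for p, r_in in ins.items():
        for q, r_out in outs.items():
            path = wrap(r_in) + star + wrap(r_out)
            rest[(p, q)] = union(rest.get((p, q)), path)
    return rest

# DFA2 after Step 1: 0 is the new initial state, 4 the new final state.
edges = {
    (0, 1): "",    # ε
    (1, 1): "0",
    (1, 2): "1",
    (1, 4): "",    # ε, state 1 was final
    (2, 1): "0",
    (2, 4): "",    # ε, state 2 was final
}
for state in (1, 2):
    edges = eliminate(edges, state)

regex = edges[(0, 4)]
print(regex)

# The automaton encoded above accepts exactly the words without "11";
# compare the computed regex with that property on all short words.
pattern = re.compile(regex)
short_words = ["".join(w) for n in range(9) for w in product("01", repeat=n)]
print(all(bool(pattern.fullmatch(w)) == ("11" not in w) for w in short_words))
```

The labels it produces carry more parentheses than the hand-written ones, since ''wrap'' does not try to simplify.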
==== 6.2. Brzozowski's algebraic method ====
[[https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)|Janusz Brzozowski]] worked out a very elegant (and more computationally efficient) way to convert DFAs to regexes. It relies on an observation called **Arden's Lemma**:
=== Arden's Lemma ===
**Proposition (Arden's Lemma).** Let $math[X, A] and $math[B] be languages such that $math[\epsilon \notin A] and $math[X = A\cdot X \cup B]. Then $math[X = A^*B].
* Note that we can apply Arden's lemma in various settings, for instance, let $math[e_A, e_B] be regexes such that $math[X = L(e_A)\cdot X \cup L(e_B)]. Then $math[X = L(e_A^*e_B)].
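To see the lemma at work on concrete data, here is a small sketch that represents languages as finite sets of words truncated at a length bound and checks that $math[A^*B] satisfies the equation. The languages ''A'' and ''B'' and the bound ''n'' are arbitrary choices for illustration, and the function names are assumptions of this sketch.

```python
def concat(X, Y, n):
    """All words x + y with x in X and y in Y, of total length <= n."""
    return {x + y for x in X for y in Y if len(x) + len(y) <= n}

def star(A, n):
    """Words of A* of length <= n, computed as a fixed point."""
    result = {""}
    while True:
        bigger = result | concat(result, A, n)
        if bigger == result:
            return result
        result = bigger

def arden_solution(A, B, n):
    """Words of A*B of length <= n."""
    return concat(star(A, n), B, n)

A = {"a", "bb"}   # does not contain the empty word
B = {"b", ""}
n = 6

X = arden_solution(A, B, n)
# X should satisfy X = A·X ∪ B on all words of length <= n.
print(X == concat(A, X, n) | B)
```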
=== Dfa to regex conversion ===
For each state $math[q], build an equation of the form: $math[q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n], such that: $math[\delta(q,c_i) = q_i]. Here $math[c_i\in\Sigma], thus $math[q_i] are the $math[c_i]-successors of $math[q]. Additionally, if $math[q] is a final state, add an $math[\epsilon]: $math[q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n \cup \epsilon].
^ ^ ^
| **Example** | {{ :lfa:2022:screenshot_2022-11-09_at_09.35.36.png?200 |}} |
| Consider the Dfa from the left figure. The equations that we get are: | ::: |
| $math[q_1 = A q_1 \cup B q_2] | ::: |
| $math[q_2 = (A\cup B)q_2 \cup \epsilon] | ::: |
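Building the equations is mechanical, as the following sketch suggests. It assumes the DFA is given as a dictionary ''delta'' from (state, symbol) pairs to successor states plus a set of final states; this encoding and the function name ''equations'' are choices made for the illustration. Symbols that lead to the same successor are grouped, as in the example above.

```python
def equations(delta, finals):
    """Return a dict mapping each state to its equation, as a string."""
    states = sorted({q for q, _ in delta})
    result = {}
    for q in states:
        # Group the symbols leaving q by the state they lead to.
        by_target = {}
        for (p, c), target in sorted(delta.items()):
            if p == q:
                by_target.setdefault(target, []).append(c)
        terms = []
        for target, symbols in by_target.items():
            if len(symbols) == 1:
                coeff = symbols[0]
            else:
                coeff = "(" + " ∪ ".join(symbols) + ")"
            terms.append(f"{coeff} {target}")
        if q in finals:
            terms.append("ε")  # final states contribute ε
        result[q] = f"{q} = " + " ∪ ".join(terms)
    return result

# The DFA described by the two equations of the example.
delta = {
    ("q1", "A"): "q1", ("q1", "B"): "q2",
    ("q2", "A"): "q2", ("q2", "B"): "q2",
}
finals = {"q2"}

for line in equations(delta, finals).values():
    print(line)
```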
=== Equations ===
Equations are expressions containing **regexes** together with **state variables** (like $math[q_1]), which are the unknowns. An equation of the form $math[q = e\cdot q'] signifies the fact that $math[L(A_q) = L(e) L(A_{q'})], where $math[L(A_q), L(A_{q'})] are the languages accepted by our DFA, **starting from states $math[q] and $math[q'], respectively**.
=== Reducing the system of equations ===
We can choose **any** equation **except** that corresponding to the initial state, and eliminate it, by exploiting **Arden's Lemma**:
* the solution to any equation of the form $math[q = e\cdot q \cup e'] is $math[q = e^*e'].
**Example**
Going back to the previous system of equations, we can find the solution to $math[q_2], which is $math[(A \cup B)^*]. Next, we substitute this solution for $math[q_2] in the equation of $math[q_1], which yields:
* $math[q_1 = A q_1 \cup B(A\cup B)^*].
* We apply Arden's Lemma one more time and obtain: $math[q_1 = A^*B(A \cup B)^*].
=== Another example ===
| The initial set of equations is: | {{ :lfa:2022:screenshot_2022-11-09_at_09.36.24.png?200 |}}|
| $math[q_1 = A q_1 \cup B q_2] | ::: |
| $math[q_2 = A q_2 \cup B q_1 \cup \epsilon] | ::: |
* We reduce the system by removing the second equation:
* $math[q_2 = A^*(Bq_1\cup \epsilon)]
* We substitute $math[q_2] into the first equation:
* $math[q_1 = A q_1 \cup BA^*(B q_1 \cup \epsilon) = (A \cup BA^*B) q_1 \cup BA^*]
* and we apply Arden's lemma again: $math[q_1 = (A \cup BA^*B)^*BA^*]
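The reduction can also be automated. Below is a minimal sketch, under the assumption that each equation is stored as a dictionary from variables to regex coefficients, with the key ''None'' holding the constant term and the empty string standing for ε. The names ''alt'', ''cat'', ''star'' and ''solve'' are hypothetical, and the output uses Python regex syntax with non-capturing groups rather than the notation of this page.

```python
import re
from itertools import product

def alt(a, b):
    """Union of two coefficients; None means 'absent'."""
    if a is None:
        return b
    if b is None:
        return a
    return f"{a}|{b}"

def cat(a, b):
    """Concatenation, parenthesising operands that contain a union."""
    def group(r):
        return f"(?:{r})" if "|" in r else r
    return group(a) + group(b)

def star(r):
    return f"(?:{r})*"

def solve(system, start):
    """Eliminate every equation except `start`, then solve `start`."""
    system = {q: dict(rhs) for q, rhs in system.items()}
    order = [q for q in system if q != start] + [start]
    for q in order:
        rhs = system.pop(q)
        loop = rhs.pop(q, None)
        if loop is not None:
            # Arden's Lemma: q = loop q ∪ rest  gives  q = loop* rest
            rhs = {v: cat(star(loop), r) for v, r in rhs.items()}
        if q == start:
            return rhs.get(None)
        for other in system.values():
            coeff = other.pop(q, None)
            if coeff is not None:
                # Substitute the solution of q into the other equation.
                for v, r in rhs.items():
                    other[v] = alt(other.get(v), cat(coeff, r))

# q1 = A q1 ∪ B q2,   q2 = A q2 ∪ B q1 ∪ ε
system = {
    "q1": {"q1": "A", "q2": "B"},
    "q2": {"q2": "A", "q1": "B", None: ""},
}
regex = solve(system, "q1")
print(regex)

# In this DFA every B switches between q1 and q2, and only q2 is final,
# so the accepted words are those with an odd number of B's.
pattern = re.compile(regex)
short_words = ["".join(w) for n in range(9) for w in product("AB", repeat=n)]
print(all(bool(pattern.fullmatch(w)) == (w.count("B") % 2 == 1)
          for w in short_words))
```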
=== Exercise ===
6.2.1. Apply Brzozowski's method to find a regex for the following DFA:
^ {{ :lfa:2022:screenshot_2022-11-09_at_15.46.23.png?200 |}} ^
q0 = ε | Aq1 | Bq2
q1 = ε | Aq1 | Bq2
q2 = Bq2 | Aq0
=> q2 = B*Aq0
\\
Replace q2:
q0 = ε | Aq1 | BB*Aq0
q1 = ε | Aq1 | BB*Aq0
=> q1 = A*(ε | BB*Aq0)
\\
Replace q1:
q0 = ε | AA*(ε | BB*Aq0) | BB*Aq0
=> q0 = ε | A+ | (A+B+A | B+A)q0
=> q0 = A* | (A+B+A | B+A)q0
=> q0 = (A+B+A | B+A)*A*
\\
From q1 = A*(ε | BB*Aq0)
=> q1 = A*(ε | BB*A(A+B+A | B+A)*A*)
=> q1 = A* | A*BB*A(A+B+A | B+A)*A*
\\
From q2 = B*Aq0
=> q2 = B*A(A+B+A | B+A)*A*
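The three solutions can be cross-checked against the DFA itself. The sketch below reads the transitions and final states off the initial equations (a term c q' in the equation of q means δ(q, c) = q', and ε marks a final state), then compares each regex with a direct simulation on all words up to a fixed length. The function names are illustrative, and the check is bounded, so it supports the result without proving it.

```python
import re
from itertools import product

# Transitions and final states read off the initial system of equations.
delta = {
    ("q0", "A"): "q1", ("q0", "B"): "q2",
    ("q1", "A"): "q1", ("q1", "B"): "q2",
    ("q2", "A"): "q0", ("q2", "B"): "q2",
}
finals = {"q0", "q1"}

# Solutions found above, in Python syntax (ε | x is written (|x)).
solutions = {
    "q0": r"(A+B+A|B+A)*A*",
    "q1": r"A*(|BB*A(A+B+A|B+A)*A*)",
    "q2": r"B*A(A+B+A|B+A)*A*",
}

def accepts(word, start):
    """Run the DFA on `word`, starting from state `start`."""
    state = start
    for c in word:
        state = delta[(state, c)]
    return state in finals

def mismatches(start, max_len=8):
    """Words on which the regex for `start` and the DFA disagree."""
    pattern = re.compile(solutions[start])
    bad = []
    for n in range(max_len + 1):
        for letters in product("AB", repeat=n):
            w = "".join(letters)
            if accepts(w, start) != bool(pattern.fullmatch(w)):
                bad.append(w)
    return bad

for q in solutions:
    print(q, mismatches(q))
```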