====== 6. Dfa to Regex conversions ======

==== 6.1. State elimination ====

Consider the following DFAs:

^ **DFA1** ^ **DFA2** ^
|{{ :lfa:screenshot_2021-11-04_at_15.33.10.png?400 |}}| {{ :lfa:2022:lfa2022_lab5_ex2_4_cerinta.png?300 |}} |


Convert each of the given DFAs to a regex, using the state-elimination strategy.
Hint: would the conversion be easier on a different, equivalent DFA?


<hidden On what DFA should the algorithm be applied?>
It is recommended to apply the State elimination algorithm on the minimal DFA.
</hidden>


<hidden DFA2>

{{ :lfa:2022:lfa2022_lab6_mindfa2_1.png?300 |}}

**Step 1:**

If there are multiple final states, make them non-final, create a new final state and add ε-transitions to it.

If a final state has any transitions going out of it, create a new final state and add ε-transition.

If the initial state has any transitions going into it, create a new initial state and add ε-transition.

{{ :lfa:2022:lfa2022_lab6_mindfa2_2.png?300 |}}

** Step 2:**

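Pick a state that is neither initial nor final. For each pair of a direct predecessor p and a direct successor s of that state, add a transition from p to s labelled in·loop*·out, where in and out are the labels of the transitions entering and leaving the picked state and loop is the label of its self-loop (if it has one). If a transition from p to s already exists, take the union of the two labels. Then remove the state.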

  - Pick **state 1**:
    - in: 
      * 0 --(ε)--> 1
      * 2 --(0)--> 1
    - out:
      * 1 --(ε)--> 4
      * 1 --(1)--> 2
    - loop: 
      * 1 --(0)--> 1
    - Add new transitions:
      * 2 --(00*)--> 4
      * 2 --(00*1)--> 2
      * 0 --(0*)--> 4
      * 0 --(0*1)--> 2
    * {{ :lfa:2022:lfa2022_lab6_mindfa2_3.png?300 |}}
  - Pick **state 3**:
    - Because state 3 has no direct successor, it can simply be removed without adding new transitions.
  - Pick **state 2**:
    - in:
      * 0 --(0*1)--> 2
    - out:
      * 2 --(ε U 00*)--> 4
    - loop:
      * 2 --(00*1)--> 2
    - Add new transitions:
      * 0 --(0*1(00*1)*(ε U 00*))--> 4
      * This transition can be written more simply, since 00* = 0+ (one or more 0s) and ε U 00* = 0*: 0 --(0*1(0+1)*0*)--> 4
    * {{ :lfa:2022:lfa2022_lab6_mindfa2_4.png?300 |}}
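
The simplification above can be sanity-checked by brute force. The sketch below is not part of the lab; it compares the two expressions on every binary word up to a chosen length using Python's ''re'' module. It reads ''0+'' as "one or more 0s" and writes (ε U 00*) as an optional group. The helper names ''words'' and ''equivalent_up_to'' are made up for this illustration, and agreement on short words is evidence, not a proof.

```python
import re
from itertools import product

# Label obtained by eliminating state 2; (ε U 00*) is written as (?:00*)?
ORIGINAL = re.compile(r"0*1(?:00*1)*(?:00*)?")
# Simplified label, reading 0+ as "one or more 0s"
SIMPLIFIED = re.compile(r"0*1(?:0+1)*0*")

def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def equivalent_up_to(p, q, max_len=10):
    """True if p and q accept exactly the same words of length <= max_len."""
    return all(
        (p.fullmatch(w) is None) == (q.fullmatch(w) is None)
        for w in words("01", max_len)
    )

print(equivalent_up_to(ORIGINAL, SIMPLIFIED))
```

Neither expression accepts a word containing ''11'', which is why ''(0+1)*'' must not be read as "any sequence of 0s and 1s" here.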

** Step 3:**

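Repeat step 2 until only the initial and the final state are left. The regex is then the label of the single transition between them. Here this is the union of the transition 0 --(0*)--> 4 (added when eliminating state 1) and the transition added when eliminating state 2: 0* U 0*1(00*1)*(ε U 00*).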
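
Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation. Transitions are stored in a dictionary from (source, target) pairs to regex strings, ε is written as the empty string so that the result is valid for Python's ''re'' module, and the function names are invented for this sketch. The example graph is the automaton after Step 1 as listed above, with the sink state 3 left out. The transition 2 --(ε)--> 4 is inferred from the label (ε U 00*) that appears later in the walkthrough.

```python
import re

def union(r, s):
    """Regex for L(r) ∪ L(s); None stands for 'no transition yet'."""
    if r is None or r == s:
        return s
    return f"{r}|{s}"

def seq(r):
    """Parenthesise r before concatenation if it contains a union."""
    return f"({r})" if "|" in r else r

def star(loop):
    """Regex for L(loop)*; loop is None when the state has no self-loop."""
    if not loop:
        return ""
    return f"{loop}*" if len(loop) == 1 else f"({loop})*"

def eliminate(edges, initial, final, order):
    """Eliminate the states in `order`; return the regex on initial -> final.

    `edges` maps (source, target) to a regex; "" is an ε-transition.
    Assumes Step 1 was applied: nothing enters `initial` or leaves `final`.
    """
    edges = dict(edges)
    for q in order:
        loop = star(edges.pop((q, q), None))
        ins = {p: r for (p, t), r in edges.items() if t == q}
        outs = {s: r for (t, s), r in edges.items() if t == q}
        # One new transition per (predecessor, successor) pair: in · loop* · out
        for p, r_in in ins.items():
            for s, r_out in outs.items():
                label = seq(r_in) + loop + seq(r_out)
                edges[(p, s)] = union(edges.get((p, s)), label)
        # Remove q together with all its transitions
        for p in ins:
            del edges[(p, q)]
        for s in outs:
            del edges[(q, s)]
    return edges.get((initial, final))

# Automaton after Step 1 (state 0 = new initial, state 4 = new final)
edges = {
    (0, 1): "", (1, 1): "0", (1, 2): "1",
    (1, 4): "", (2, 1): "0", (2, 4): "",
}
regex = eliminate(edges, initial=0, final=4, order=[1, 2])
print(regex)
print(re.fullmatch(f"(?:{regex})", "0010") is not None)
```

The helper ''seq'' over-parenthesises on purpose: it wraps any label containing a union, which keeps the code short at the cost of a few redundant brackets.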
</hidden>

<hidden DFA1>

  - first, we apply the minimisation algorithm
    * {{ :lfa:2022:lab6_1_dfa1_1.png?500 |}}
  - we add the new final and initial states
    * {{ :lfa:2022:lab6_1_dfa1_2.png?500 |}}
  - we eliminate the state 8
    - as it has no outgoing transitions, we can just remove it
    * {{ :lfa:2022:lab6_1_dfa1_3.png?500 |}}
  - we eliminate the state 0,1
    - in:
      * init --(ε)--> 0,1
    - out:
      * 0,1 --(ε)--> fin
      * 0,1 --(1)--> 2
    - loop:
      * 0,1 --(0)--> 0,1
    - we add:
      * init --(0*)--> fin
      * init --(0*1)--> 2
    * {{ :lfa:2022:lab6_1_dfa1_4.png?500 |}}
  - we eliminate the state 2
    - in:
      * init --(0*1)--> 2
      * 3 --(1)--> 2
    - out:
      * 2 --(ε)--> fin
      * 2 --(0)--> 3
      * 2 --(1)--> 4
    - we add:
      * init --(0*1)--> fin
      * init --(0*10)--> 3
      * init --(0*11)--> 4
      * 3 --(1)--> fin
      * 3 --(10)--> 3
      * 3 --(11)--> 4
    * {{ :lfa:2022:lab6_1_dfa1_5.png?500 |}}
  - we eliminate the state 4
    - in:
      * init --(0*11)--> 4
      * 3 --(11)--> 4
      * 6,7 --(1)--> 4
    - out:
      * 4 --(0)--> 6,7
    - we add:
      * init --(0*110)--> 6,7
      * 3 --(110)--> 6,7
      * 6,7 --(10)--> 6,7
    * {{ :lfa:2022:lab6_1_dfa1_6.png?500 |}}
  - we eliminate the state 6,7
    - in:
      * init --(0*110)--> 6,7
      * 3 --(110)--> 6,7
      * 5 --(1)--> 6,7
    - out:
      * 6,7 --(0)--> 5
      * 6,7 --(ε)--> fin
    - loop:
      * 6,7 --(10)--> 6,7
    - we add:
      * init --(0*110(10)*0)--> 5
      * init --(0*110(10)*)--> fin
      * 3 --(110(10)*0)--> 5
      * 3 --(110(10)*)--> fin
      * 5 --(1(10)*0)--> 5
      * 5 --(1(10)*)--> fin
    * {{ :lfa:2022:lab6_1_dfa1_7.png?500 |}}
  - we eliminate the state 3
    - in:
      * init --(0*10)--> 3
    - out:
      * 3 --(ε|1|110(10)*)--> fin
      * 3 --(0|110(10)*0)--> 5
    - loop:
      * 3 --(10)--> 3
    - we add:
      * init --(0*10(10)*(ε|1|110(10)*) )--> fin
      * init --(0*10(10)*(0|110(10)*0) )-->5
    * {{ :lfa:2022:lab6_1_dfa1_8.png?500 |}}
  - we eliminate the state 5
    - in:
      * init --(0*110(10)*0|0*10(10)*(0|110(10)*0) )--> 5
    - out:
      * 5 --(1(10)*)--> fin
    - loop:
      * 5 --(1(10)*0)--> 5
    - we add:
      * init --( (0*110(10)*0|0*10(10)*(0|110(10)*0) ) (1(10)*0)*1(10)*)--> fin
  - the final regex is ''0*|0*1|0*110(10)*|0*10(10)*(ε|1|110(10)*)|(0*110(10)*0|0*10(10)*(0|110(10)*0))(1(10)*0)*1(10)*''

</hidden>


==== 6.2. Brzozowski's algebraic method ====

[[https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)|Janusz Brzozowski]] worked out a very elegant (and more computationally efficient) way to convert DFAs to regexes. It relies on an observation called **Arden's Lemma**:

=== Arden's Lemma ===

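**Proposition (Arden's Lemma).** Let $math[X, A] and $math[B] be languages such that $math[X = A\cdot X \cup B]. If $math[\epsilon \notin A], then $math[X = A^*B] is the unique solution; in general, $math[A^*B] is the smallest solution.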

  * Note that we can apply Arden's lemma in various settings, for instance, let $math[e_A, e_B] be regexes such that $math[X = L(e_A)\cdot X \cup L(e_B)]. Then $math[X = L(e_A^*e_B)].
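
To see the lemma at work on a concrete case, the sketch below builds $math[A^*B] for two small finite languages, keeping only words up to a length bound ''n'', and checks that the result satisfies $math[X = A\cdot X \cup B] within that bound. The languages, the bound and the function names are arbitrary choices for this illustration.

```python
def concat(X, Y, n):
    """Concatenation X·Y, keeping only words of length <= n."""
    return {x + y for x in X for y in Y if len(x) + len(y) <= n}

def arden_solution(A, B, n):
    """All words of A*B of length <= n, built as B ∪ AB ∪ AAB ∪ ..."""
    result = {w for w in B if len(w) <= n}
    layer = set(result)
    while layer:  # stops once no new word fits within n letters
        layer = concat(A, layer, n) - result
        result |= layer
    return result

A = {"ab"}      # ε is not in A, so the solution is unique
B = {"", "c"}
n = 6

X = arden_solution(A, B, n)
print(sorted(X, key=lambda w: (len(w), w)))

# X satisfies the equation X = A·X ∪ B (restricted to length <= n)
print(X == (concat(A, X, n) | {w for w in B if len(w) <= n}))
```

The truncation is exact for this check: any word of $math[A\cdot X] with at most ''n'' letters is built from a word of $math[X] that also has at most ''n'' letters.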

=== Dfa to regex conversion ===

For each state $math[q], build an equation of the form $math[q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n], where $math[\delta(q,c_i) = q_i]. Here $math[c_i\in\Sigma], so each $math[q_i] is the $math[c_i]-successor of $math[q]. Additionally, if $math[q] is a final state, add an $math[\epsilon]: $math[q = c_1 q_1 \cup c_2 q_2 \cup \ldots \cup c_n q_n \cup \epsilon].

^ ^ ^
| **Example** | {{ :lfa:2022:screenshot_2022-11-09_at_09.35.36.png?200 |}} |
| Consider the DFA in the figure. The equations that we get are: | ::: |
| $math[q_1 = A q_1 \cup B q_2] | ::: |
| $math[q_2 = (A\cup B)q_2 \cup \epsilon] | ::: |
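
The construction of the equations is mechanical, as the following sketch suggests. It assumes a complete DFA given as a Python dictionary ''delta'' from (state, symbol) pairs to states. The transition table below is read off the two equations of the example, and the function names are hypothetical.

```python
from collections import defaultdict

def build_equations(delta, finals):
    """Map each state q to (terms, has_epsilon).

    terms[q2] is the sorted list of symbols c with delta[(q, c)] == q2;
    has_epsilon is True exactly when q is a final state.
    """
    terms = defaultdict(lambda: defaultdict(list))
    for (q, c), q2 in delta.items():
        terms[q][q2].append(c)
    return {
        q: ({t: sorted(cs) for t, cs in succ.items()}, q in finals)
        for q, succ in terms.items()
    }

def show(q, terms, has_epsilon):
    """Render one equation in the notation used on this page."""
    parts = []
    for target, symbols in sorted(terms.items()):
        coeff = symbols[0] if len(symbols) == 1 else "(" + " ∪ ".join(symbols) + ")"
        parts.append(f"{coeff} {target}")
    if has_epsilon:
        parts.append("ε")
    return f"{q} = " + " ∪ ".join(parts)

# DFA of the example, reconstructed from its equations
delta = {
    ("q1", "A"): "q1", ("q1", "B"): "q2",
    ("q2", "A"): "q2", ("q2", "B"): "q2",
}
equations = build_equations(delta, finals={"q2"})
for q in sorted(equations):
    print(show(q, *equations[q]))
```

Symbols that lead to the same successor are grouped into one coefficient, which is how $math[(A\cup B)q_2] arises.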

=== Equations ===
Equations are expressions containing **regexes** together with **state variables** (like $math[q_1]), which are the unknowns. An equation of the form $math[q = e\cdot q'] states that $math[L(A_q) = L(e) L(A_{q'})], where $math[L(A_q), L(A_{q'})] are the languages accepted by our DFA **starting from states $math[q] and $math[q'], respectively**.

=== Reducing the system of equations ===

We can choose **any** equation **except** that corresponding to the initial state, and eliminate it, by exploiting **Arden's Lemma**:
  * the solution to any equation of the form $math[q = e\cdot q \cup e'] is $math[q = e^*e'].

**Example**

Going back to the previous system of equations, we can find the solution to $math[q_2] which is: $math[(A \cup B)^*]. Next, we can replace the solution to $math[q_2] in $math[q_1] which yields:
  * $math[q_1 = A q_1 \cup B(A\cup B)^*].
  * We apply Arden's Lemma one more time and obtain: $math[q_1 = A^*B(A \cup B)^*].

=== Another example ===

| The initial set of equations is: | {{ :lfa:2022:screenshot_2022-11-09_at_09.36.24.png?200 |}}|
| $math[q_1 = A q_1 \cup B q_2] | ::: |
| $math[q_2 = A q_2 \cup B q_1 \cup \epsilon] | ::: |

  * We reduce the system by removing the second equation: 
  * $math[q_2 = A^*(Bq_1\cup \epsilon)] 
  * We replace $math[q_2] into the first equation: 
  * $math[q_1 = A q_1 \cup BA^*(B q_1 \cup \epsilon) = (A \cup BA^*B) q_1 \cup BA^*]
  * and we apply Arden's lemma again: $math[q_1 = (A \cup BA^*B)^*BA^*]
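
The reduction can also be automated. The following is a rough sketch, not an optimised implementation. Each equation is a dictionary from variables to regex coefficients, the key ''CONST'' (an assumption of this sketch) holds the variable-free term, and ε is written as the empty string. All function names are invented here. The sketch eliminates every non-initial equation with Arden's Lemma, substitutes the result into the remaining equations, and finally solves the initial one.

```python
CONST = None  # key of the constant (variable-free) term of an equation

def top_level_union(r):
    """True if r contains a '|' outside all parentheses."""
    depth = 0
    for ch in r:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            return True
    return False

def grp(r):
    """Parenthesise r before concatenation when needed."""
    return f"({r})" if top_level_union(r) else r

def star(r):
    return f"{r}*" if len(r) == 1 else f"({r})*"

def alt(r, s):
    """Union of two coefficients; None means 'term absent'."""
    if r is None or r == s:
        return s
    return f"{r}|{s}"

def arden(eq, q):
    """Rewrite q = e q ∪ rest as q = e* rest (Arden's Lemma)."""
    eq = dict(eq)
    loop = eq.pop(q, None)
    if loop is None:
        return eq
    return {v: star(loop) + grp(r) for v, r in eq.items()}

def substitute(eq, q, solution):
    """Replace variable q in eq by its solution and merge coefficients."""
    eq = dict(eq)
    coeff = eq.pop(q, None)
    if coeff is None:
        return eq
    for v, r in solution.items():
        eq[v] = alt(eq.get(v), grp(coeff) + grp(r))
    return eq

def solve(system, initial):
    """Regex for the language accepted from `initial`."""
    system = dict(system)
    for q in [s for s in system if s != initial]:
        solution = arden(system.pop(q), q)
        system = {p: substitute(eq, q, solution) for p, eq in system.items()}
    return arden(system[initial], initial).get(CONST)

first = {"q1": {"q1": "A", "q2": "B"}, "q2": {"q2": "A|B", CONST: ""}}
second = {"q1": {"q1": "A", "q2": "B"}, "q2": {"q2": "A", "q1": "B", CONST: ""}}
print(solve(first, "q1"))
print(solve(second, "q1"))
```

The order in which the non-initial equations are eliminated does not change the language of the result, but it can change the shape and size of the regex.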

=== Exercise ===

 6.2.1. Apply Brzozowski's method to find a regex for the following DFA:
^ {{ :lfa:2022:screenshot_2022-11-09_at_15.46.23.png?200 |}} ^


<hidden Solution>

q0 = ε | Aq1 | Bq2

q1 = ε | Aq1 | Bq2

q2 = Bq2 | Aq0

=> q2 = B*Aq0

\\ 

Replace q2:

q0 = ε | Aq1 | BB*Aq0

q1 = ε | Aq1 | BB*Aq0


=> q1 = A*(ε | BB*Aq0)

\\ 

Replace q1:

q0 = ε | AA*(ε | BB*Aq0) | BB*Aq0

=> q0 = ε | A+ | (A+B+A | B+A)q0

=> q0 = A* | (A+B+A | B+A)q0

=> q0 = (A+B+A | B+A)*A*

\\ 

From q1 = A*(ε | BB*Aq0)

=> q1 = A*(ε | BB*A(A+B+A | B+A)*A*)

=> q1 = A* | A*BB*A(A+B+A | B+A)*A*

\\ 

From q2 = B*Aq0

=> q2 = B*A(A+B+A | B+A)*A*
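
A solution like this one can be cross-checked against the automaton. The sketch below is only a sanity check, not a proof. The transition table is reconstructed from the three equations at the top of the solution, the final states are the ones whose equation contains ε, and the regexes are transcribed into Python ''re'' syntax. The names ''accepts_from'' and ''agrees'' are made up for this illustration.

```python
import re
from itertools import product

# Transitions read off the equations q0, q1, q2 of the solution
DELTA = {
    ("q0", "A"): "q1", ("q0", "B"): "q2",
    ("q1", "A"): "q1", ("q1", "B"): "q2",
    ("q2", "A"): "q0", ("q2", "B"): "q2",
}
FINALS = {"q0", "q1"}  # the equations that contain ε

# Solutions found above, one regex per start state
SOLUTION = {
    "q0": r"(A+B+A|B+A)*A*",
    "q1": r"A*|A*BB*A(A+B+A|B+A)*A*",
    "q2": r"B*A(A+B+A|B+A)*A*",
}

def accepts_from(state, word):
    """Run the DFA on `word`, starting in `state`."""
    for c in word:
        state = DELTA[(state, c)]
    return state in FINALS

def agrees(state, pattern, max_len=10):
    """True if DFA and regex agree on every word of length <= max_len."""
    regex = re.compile(f"(?:{pattern})")
    for n in range(max_len + 1):
        for letters in product("AB", repeat=n):
            word = "".join(letters)
            if accepts_from(state, word) != (regex.fullmatch(word) is not None):
                return False
    return True

for state, pattern in SOLUTION.items():
    print(state, agrees(state, pattern))
```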
</hidden>