5. Regex to NFA to DFA conversions

Thompson's algorithm converts a regular expression to an NFA with epsilon-transitions. Subset construction then converts an NFA to an equivalent DFA.

Thompson's algorithm applies the following rules recursively:
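As a rough sketch of these rules in code, assuming the textbook formulation of Thompson's construction (the rules used in this lab may differ in detail, for example in how concatenation joins the two sub-automata). The tuple encoding of regular expressions and the names `thompson`, `eps_closure` and `accepts` are assumptions made for this illustration only.

```python
from itertools import count

EPS = None  # label used for epsilon-transitions (a convention of this sketch)


def thompson(regex):
    """Build an epsilon-NFA with one initial and one final state.

    Hypothetical regex encoding: a one-character string is a symbol,
    ("eps",) is the empty word, ("cat", r, s), ("union", r, s), ("star", r).
    Returns (start, accept, delta), delta: (state, label) -> set of states.
    """
    fresh = count()  # generator of fresh state names 0, 1, 2, ...
    delta = {}

    def edge(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    def build(r):
        if isinstance(r, str):  # base case: a single symbol
            s, f = next(fresh), next(fresh)
            edge(s, r, f)
            return s, f
        op = r[0]
        if op == "eps":  # base case: the empty word
            s, f = next(fresh), next(fresh)
            edge(s, EPS, f)
            return s, f
        if op == "cat":  # final state of r1 leads into the start of r2
            s1, f1 = build(r[1])
            s2, f2 = build(r[2])
            edge(f1, EPS, s2)
            return s1, f2
        if op == "union":  # new start branches to both, both join in new final
            s1, f1 = build(r[1])
            s2, f2 = build(r[2])
            s, f = next(fresh), next(fresh)
            edge(s, EPS, s1)
            edge(s, EPS, s2)
            edge(f1, EPS, f)
            edge(f2, EPS, f)
            return s, f
        if op == "star":  # skip the body entirely, or loop back after it
            s1, f1 = build(r[1])
            s, f = next(fresh), next(fresh)
            edge(s, EPS, s1)
            edge(s, EPS, f)
            edge(f1, EPS, s1)
            edge(f1, EPS, f)
            return s, f
        raise ValueError(f"unknown regex node: {r!r}")

    start, accept = build(regex)
    return start, accept, delta


def eps_closure(states, delta):
    """All states reachable from `states` using only epsilon-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def accepts(nfa, word):
    """Simulate the epsilon-NFA on `word` by tracking the set of active states."""
    start, accept, delta = nfa
    current = eps_closure({start}, delta)
    for c in word:
        moved = set()
        for q in current:
            moved |= delta.get((q, c), set())
        current = eps_closure(moved, delta)
    return accept in current


# (0 U 1)* 1 : binary words that end in 1
nfa = thompson(("cat", ("star", ("union", "0", "1")), "1"))
print(accepts(nfa, "01"))  # True
print(accepts(nfa, "10"))  # False
```

Each recursive call returns exactly one initial and one final state, which is the invariant that lets the rules be composed without inspecting the inside of a sub-automaton.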

Exercises

5.1. Convert one of the following regular expressions to an NFA using Thompson's algorithm and then to a DFA using subset construction.

  • $ (1 \cup \varepsilon)(0^*1)^*0^* $
  • $ (0\cup 01) \cup 1(10^*)^* \cup \varepsilon $
  • $ (((00 \cup 11)^*1)^*0)^* $
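To check a hand-computed DFA, subset construction can be sketched as follows. This is a minimal illustration under assumptions: the NFA is given as a dictionary from (state, label) to a set of states, `None` marks epsilon-transitions, and the function names are hypothetical. The example automaton is a small hand-written NFA for $ a^*b $, not one of the expressions above.

```python
from collections import deque

EPS = None  # label used for epsilon-transitions (a convention of this sketch)


def eps_closure(states, delta):
    """All states reachable from `states` using only epsilon-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def subset_construction(start, accepting, delta, alphabet):
    """Return (dfa_start, dfa_accepting, dfa_delta).

    DFA states are frozensets of NFA states; the empty frozenset is the sink.
    """
    dfa_start = frozenset(eps_closure({start}, delta))
    dfa_delta = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        subset = queue.popleft()
        for c in alphabet:
            moved = set()
            for q in subset:
                moved |= delta.get((q, c), set())
            target = frozenset(eps_closure(moved, delta))
            dfa_delta[(subset, c)] = target
            if target not in seen:  # explore only reachable subsets
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accepting = {subset for subset in seen if subset & accepting}
    return dfa_start, dfa_accepting, dfa_delta


def run_dfa(dfa, word):
    """Follow the unique DFA path for `word` and report acceptance."""
    state, accepting, delta = dfa
    for c in word:
        state = delta[(state, c)]
    return state in accepting


# NFA for a*b: 0 --eps--> 1, 1 --a--> 1, 1 --b--> 2 (state 2 is accepting)
nfa_delta = {(0, EPS): {1}, (1, "a"): {1}, (1, "b"): {2}}
dfa = subset_construction(0, {2}, nfa_delta, "ab")
print(run_dfa(dfa, "aab"))  # True
print(run_dfa(dfa, "aba"))  # False
```

Only subsets reachable from the initial subset are generated, so the resulting DFA is usually far smaller than the full powerset.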

5.2. What would happen if the construction step for $ e^* $ in Thompson's algorithm were defined as follows? What would go wrong? Find a counterexample word for each of them.

5.3.1. What regular expressions can be converted to NFAs without using epsilon-transitions (and without involving any NFA to DFA conversion)? Start with simple cases, such as concatenations of symbols and unions of symbols, then try more complex ones. Can this approach be turned into a recursive algorithm like Thompson's algorithm?

5.3.2. Consider an NFA as an interface that provides the following methods:

  • initState(): State returns the initial state of the NFA
  • endState(): State returns a distinguished state of the NFA
  • toggleEpsilon(): NFA returns the same NFA with the finality of initState() toggled: if initState() is a final state it becomes non-final, and vice versa
  • merge(nfa2: NFA): NFA returns an NFA constructed from this one by replacing all transitions to endState() with transitions to nfa2.initState(), deleting the current endState() and replacing it with nfa2.endState()
  • adjoin(nfa2: NFA): NFA returns an NFA constructed from this one by replacing all transitions from nfa2.initState() with transitions from initState() and all transitions to nfa2.endState() with transitions to endState(), then deleting nfa2.initState() and nfa2.endState(); initState() and endState() remain the same
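For orientation, the interface above could be written down as an abstract class. This is only a sketch of the signatures: the method names come from the list above, while the `State` alias and the choice of an abstract base class are assumptions of this illustration, and no representation or implementation is implied.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Hashable

State = Hashable  # assumption: any hashable value can name a state


class NFA(ABC):
    """Signatures only; a concrete class must pick its own representation."""

    @abstractmethod
    def initState(self) -> State:
        """Return the initial state."""

    @abstractmethod
    def endState(self) -> State:
        """Return the distinguished state."""

    @abstractmethod
    def toggleEpsilon(self) -> NFA:
        """Return this NFA with the finality of initState() flipped."""

    @abstractmethod
    def merge(self, nfa2: NFA) -> NFA:
        """Redirect transitions into endState() to nfa2.initState();
        nfa2.endState() becomes the new endState()."""

    @abstractmethod
    def adjoin(self, nfa2: NFA) -> NFA:
        """Attach nfa2 between initState() and endState(), dropping
        nfa2.initState() and nfa2.endState()."""
```

Because every method is abstract, `NFA()` cannot be instantiated directly; a solution to the exercise would subclass it.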

Using only these operations to create NFAs from other NFAs, try to find a recursive algorithm to convert a Regular Expression to an NFA without epsilon-transitions. Discuss base cases, construction steps and justify them. It may be useful to consider some invariants of the algorithm.