Differences

This shows you the differences between two versions of the page.

lfa:nfa [2018/10/03 11:24]
pdmatei
lfa:nfa [2020/10/19 15:38] (current)
pdmatei
Line 7: Line 7:
 In more formal terms, we have a //generator// - a means to construct a language from a regular expression, but we lack a means for //accepting// (words of) languages.
  
-We informally illustrate, in Haskell, an algorithm for checking the membership $math[w \in L((A\cup B)(a\cup b)^*(0 \cup 1)^*)]:
- 
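-<code haskell>
--- stage 1: the word must start with 'A' or 'B'
-check :: String -> Bool
-check ('A':xs) = check1 xs
-check ('B':xs) = check1 xs
-check _ = False
-
--- stage 2: letters keep us here, digits move us to stage 3
-check1 :: String -> Bool
-check1 ('a':xs) = check1 xs
-check1 ('b':xs) = check1 xs
-check1 ('0':xs) = check2 xs
-check1 ('1':xs) = check2 xs
-check1 [] = True
-check1 _ = False
-
--- stage 3: only digits may follow
-check2 :: String -> Bool
-check2 ('0':xs) = check2 xs
-check2 ('1':xs) = check2 xs
-check2 [] = True
-check2 _ = False
-</code>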
- 
-The algorithm proceeds in **three stages**: 
-  * in the first stage, we check whether ''A'' or ''B'' is encountered; if so, we move on to the second stage, otherwise we report false;
-  * in the second stage, we check whether ''a'', ''b'', ''0'' or ''1'' is encountered: if ''a'' or ''b'' is found, we continue inspection in the second stage; if ''0'' or ''1'' is found, we continue inspection in the third stage; finally, if the string terminates, we report true;
-  * in the third stage, we look for binary digits in a similar way: on ''0'' or ''1'' we remain in the third stage, and if the string terminates, we report true.
- 
-The same strategy can be written in a more elegant way as: 
-<code haskell> 
-check :: String -> Bool
-check w = chk (w ++ "!") [0]
-   where
-     -- unreachable for well-formed input, since '!' always ends the word
-     chk [] _ = False
-     chk (x:xs) set
-       | (x `elem` ['A', 'B']) && (0 `elem` set) = chk xs [1,2,3]
-       | (x `elem` ['a', 'b']) && (1 `elem` set) = chk xs [1,2,3]
-       | (x `elem` ['0', '1']) && (2 `elem` set) = chk xs [2,3]
-       | (x == '!') && (3 `elem` set) = True
-       | otherwise = False
-</​code>​ 
- 
-Here, we have introduced the symbol ''​!''​ to mark the string termination,​ and thus make the whole code nicer to write. We have also made the //stage idea// explicit. The procedure ''​chk''​ maintains a set of //stages// or //states//: 
-  * $math[0\in set] indicates that we are in the initial stage, where we are looking for ''A'' or ''B'';
-  * $math[1\in set] indicates that the initial letter has been read and that further ''a''s or ''b''s may follow;
-  * $math[2\in set] indicates that ''0''s or ''1''s may follow; once a digit is read, ''1'' is dropped from the set, so the sequence of alphabetic symbols has ended;
-  * $math[3\in set] indicates that the string may also terminate at this point - ''3'' is an //end-stage//.
- 
-We start in the initial stage. Whenever a symbol is read, the stage, i.e. the set of possible lookups, is updated: for instance, when ''0'' or ''1'' is read, only the second and third situations remain possible.
- 
-The idea behind our code could be expressed as the following diagram: 
-{{:​lfa:​example.png|}} 
-where 
-  * each node is a **state**, which indicates what is the current stage in the recognition of the input word; 
-  * each arrow is a **transition** which takes the recognition process from one stage to another; 
-  * here, $math[Q_0] is the initial state, $math[Q'] is the state from which a lower-case letter or a digit may follow, and $math[Q''] is the state from which only digits are accepted.
-
-The string can terminate successfully in both $math[Q'] and $math[Q''], which is shown via double circles.
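The diagram can also be read as a transition table. The following is a minimal sketch, not part of either revision of the page: the names ''State'', ''Q1'' and ''Q2'' (standing for the second and third stage), ''step'' and ''final'' are assumptions introduced only for illustration. The accepting states are taken from the first listing, where ''check1 []'' and ''check2 []'' both report true.

```haskell
import Control.Monad (foldM)

-- Hypothetical names for the three states of the diagram.
data State = Q0 | Q1 | Q2 deriving (Eq, Show)

-- One arrow of the diagram; Nothing means that no arrow exists.
step :: State -> Char -> Maybe State
step Q0 c | c `elem` "AB" = Just Q1
step Q1 c | c `elem` "ab" = Just Q1
          | c `elem` "01" = Just Q2
step Q2 c | c `elem` "01" = Just Q2
step _  _ = Nothing

-- States in which the word may end successfully.
final :: State -> Bool
final s = s `elem` [Q1, Q2]

-- Follow the arrows symbol by symbol; fail as soon as one is missing.
accepts :: String -> Bool
accepts w = maybe False final (foldM step Q0 w)

main :: IO ()
main = mapM_ (print . accepts) ["Aab01", "B", "", "A0a"]
-- prints True, True, False, False (one per line)
```

Using ''Maybe'' makes a missing arrow explicit, so no end marker such as ''!'' is needed in this variant.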
  
 ==== Nondeterministic automata ====
Line 195: Line 143:
  
 {{:lfa:slide4.jpg|}}
- 
-From the proof, a naive algorithm can be easily implemented. We illustrate it in Haskell: 
-<code haskell> 
-data RegExp = EmptyString
-            | Atom Char
-            | RegExp :| RegExp
-            | RegExp :. RegExp
-            | Kleene RegExp
-            deriving Show
-
-data NFA = NFA {delta :: [(Int,Char,Int)], fin :: [Int]} deriving Show
-</​code>​ 
-We begin with a list-based representation of the transition function $math[\delta]. We assume the symbol ''e'' is reserved for the empty string.
- 
-<code haskell> 
--- the strategy is to increment each state by i
-relabel :: Int -> NFA -> NFA
-relabel i (NFA delta fin) = NFA (map (\(s,c,s')->(s+i,c,s'+i)) delta) (map (+i) fin)
-</​code>​ 
- 
-Since we have chosen to represent states as integers, we use a re-labelling function to ensure uniqueness. Re-labelling relies on state increment. For instance, by calling ''relabel (f1+1) n'', we ensure that the NFA ''n'' will have the initial state equal to ''f1+1''. Since ''f1'' is the final state of the first NFA and, in our code, also its largest state, the re-labelled states cannot clash with it.
- 
- 
-<code haskell> 
-toNFA :: RegExp -> NFA
-toNFA EmptyString = NFA [(0,'e',1)] [1]
-toNFA (Atom c) = NFA [(0,c,1)] [1]
-toNFA (e :. e') = let NFA delta1 [f1] = toNFA e
-                      NFA delta2 [f2] = relabel (f1+1) (toNFA e')
-                  in NFA (delta1++delta2++[(f1,'e',f1+1)]) [f2]
-toNFA (e :| e') = let NFA delta1 [f1] = relabel 1 (toNFA e)
-                      NFA delta2 [f2] = relabel (f1+1) (toNFA e')
-                  in NFA (delta1 ++ delta2 ++[(0,'e',1),
-                                              (0,'e',f1+1),
-                                              (f1,'e',f2+1),
-                                              (f2,'e',f2+1)]) [f2+1]
-toNFA (Kleene e) = let NFA delta [f] = toNFA e in NFA (delta++[(0,'e',f),(f,'e',0)]) [f]
-</​code>​ 
- 
-Apart from relabelling,​ the code follows exactly the steps from the proof. 
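To check a word against an NFA in this representation, one possible approach is to track the set of states reachable so far. The sketch below is not part of either revision of the page: ''closure'', ''step'', ''accepts'' and ''example'' are hypothetical names. It assumes, as in the listings above, that ''0'' is the initial state and that ''e'' never occurs as an input symbol.

```haskell
import Data.List (nub)

-- Same representation as in the text: transitions and final states.
-- Assumptions: 0 is the initial state, 'e' labels empty-string moves.
data NFA = NFA {delta :: [(Int,Char,Int)], fin :: [Int]} deriving Show

-- All states reachable from the given ones using only 'e'-transitions.
closure :: NFA -> [Int] -> [Int]
closure n states
  | all (`elem` states) next = states
  | otherwise                = closure n (nub (states ++ next))
  where next = [t | (s,'e',t) <- delta n, s `elem` states]

-- Consume one input symbol, then take the closure again.
step :: NFA -> [Int] -> Char -> [Int]
step n states c =
  closure n (nub [t | (s,c',t) <- delta n, c' == c, s `elem` states])

-- A word is accepted if some reachable state is final at the end.
accepts :: NFA -> String -> Bool
accepts n w = any (`elem` fin n) (foldl (step n) (closure n [0]) w)

-- The transitions that toNFA (Atom 'a' :. Kleene (Atom 'b')) yields,
-- written out by hand so that this block is self-contained.
example :: NFA
example = NFA [(0,'a',1),(2,'b',3),(2,'e',3),(3,'e',2),(1,'e',2)] [3]

main :: IO ()
main = mapM_ (print . accepts example) ["a", "abb", "", "ba"]
-- prints True, True, False, False (one per line)
```

Lists stand in for sets here to stay close to the list-based style of the page; ''Data.Set'' would be the more efficient choice for larger automata.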
- 
-