Differences

This shows you the differences between two versions of the page.

--- lfa:nfa [2018/10/03 11:24]
pdmatei
+++ lfa:nfa [2020/10/19 15:38] (current)
pdmatei
@@ Line 7: / Line 7: @@
 In more formal terms, we have a //generator// - a means to construct a language from a regular expression, but we lack a means for //accepting// (words of) languages.
-We shall informally illustrate an algorithm for verifying the membership $math[w \in L((A\cup B)(a\cup b)*(0 \cup 1)*)], in Haskell:
-<code haskell>
-check ('A':xs) = check1 xs
-check ('B':xs) = check1 xs
-check _ = False
-check1 ('a':xs) = check1 xs
-check1 ('b':xs) = check1 xs
-check1 ('0':xs) = check2 xs
-check1 ('1':xs) = check2 xs
-check1 [] = True
-check1 _ = False
-check2 ('0':xs) = check2 xs
-check2 ('1':xs) = check2 xs
-check2 [] = True
-check2 _ = False
-</code>
-The algorithm proceeds in **three stages**:
-  * in the first stage, we check whether the first symbol is ''A'' or ''B''; if it is, we move on to the second stage, otherwise we report false;
-  * in the second stage, we check if ''a'', ''b'', ''0'' or ''1'' are encountered; if ''a'' or ''b'' are found, we continue inspection in the second stage; if ''0'' or ''1'' are found, we continue inspection in the third stage; finally, if the string terminates, we report true;
-  * in the third stage, we accept only binary digits in a similar way: ''0'' or ''1'' keeps us in the third stage, and if the string terminates, we report true.
-The same strategy can be written in a more elegant way as:
-<code haskell>
-check w = chk (w ++ "!") [0]
-   where chk (x:xs) set
-            | (x `elem` ['A', 'B']) && (0 `elem` set) = chk xs [1,2,3]
-            | (x `elem` ['a', 'b']) && (1 `elem` set) = chk xs [1,2,3]
-            | (x `elem` ['0', '1']) && (2 `elem` set) = chk xs [2,3]
-            | (x == '!') && (3 `elem` set) = True
-            | otherwise = False
-</code>
-Here, we have introduced the symbol ''!'' to mark the string termination, and thus make the whole code nicer to write. We have also made the //stage idea// explicit. The procedure ''chk'' maintains a set of //stages// or //states//:
-  * $math[0\in set] indicates that we are in the initial stage, where we are looking for ''A'' or ''B''
-  * $math[1\in set] indicates that we have read a sequence of alphabetic symbols: ''a''s, ''b''s may follow
-  * $math[2\in set] indicates that the sequence of alphabetic symbols has ended; only ''0''s or ''1''s may follow;
-  * $math[3\in set] indicates that the string may also terminate at any time - ''3'' is an //end-stage//.
-We start in the initial stage. Whenever a symbol is read, the stage, i.e. the set of states from which recognition may continue, is updated: for instance, after ''0'' or ''1'' is read, only states ''2'' and ''3'' remain possible.
-The idea behind our code could be expressed as the following diagram:
-{{:lfa:example.png|}}
-where
-  * each node is a **state**, which indicates what is the current stage in the recognition of the input word;
-  * each arrow is a **transition** which takes the recognition process from one stage to another;
-  * here, $math[Q_0] is the initial state, $math[Q'] is the state from which any lower-case letter or binary digit of the alphabet may follow, and $math[Q''] is the state from which only binary digits are accepted.
-The string can terminate successfully in both $math[Q'] and $math[Q''], which is shown via double circles.
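The state and transition reading of the diagram can also be written down directly. The following is a minimal sketch, not part of the original page: the names ''State'', ''step'' and ''accepts'' are hypothetical, ''Q1'' and ''Q2'' stand for $math[Q'] and $math[Q''], and the final states are taken from the first Haskell version above, where ''check1'' and ''check2'' both return true on the empty remainder.

```haskell
import Control.Monad (foldM)

-- Hypothetical encoding of the three states: Q1 ~ Q', Q2 ~ Q''.
data State = Q0 | Q1 | Q2 deriving (Eq, Show)

-- One arrow of the diagram; Nothing means there is no arrow, so we reject.
step :: State -> Char -> Maybe State
step Q0 c | c `elem` "AB" = Just Q1
step Q1 c | c `elem` "ab" = Just Q1
          | c `elem` "01" = Just Q2
step Q2 c | c `elem` "01" = Just Q2
step _  _ = Nothing

-- Run the automaton from Q0 and check that it stops in a final state.
accepts :: String -> Bool
accepts w = case foldM step Q0 w of
  Just q  -> q `elem` [Q1, Q2]
  Nothing -> False
```

For instance, ''accepts "Aab01"'' evaluates to ''True'', while ''accepts "A0a"'' evaluates to ''False'', because no arrow labelled ''a'' leaves ''Q2''.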
 ==== Nondeterministic automata ====
@@ Line 195: / Line 143: @@
 {{:lfa:slide4.jpg|}}
-From the proof, a naive algorithm can be easily implemented. We illustrate it in Haskell:
-<code haskell>
-data RegExp = EmptyString |
-              Atom Char |
-              RegExp :| RegExp |
-              RegExp :. RegExp |
-              Kleene RegExp deriving Show
-data NFA = NFA {delta :: [(Int,Char,Int)], fin :: [Int]} deriving Show
-</code>
-We begin with a list-based representation of the transition function $math[\delta]: each triple ''(s,c,s')'' is a transition from state ''s'' to state ''s''' on symbol ''c''. We assume the symbol ''e'' is reserved for the empty string.
-<code haskell>
--- shift every state of the NFA (in transitions and in final states) by i
-relabel :: Int -> NFA -> NFA
-relabel i (NFA delta fin) = NFA (map (\(s,c,s')->(s+i,c,s'+i)) delta) (map (+i) fin)
-</code>
-Since we have chosen to represent states as integers, we use a re-labelling function to ensure that two NFAs which are combined do not share states. Re-labelling relies on state increment. For instance, by calling ''relabel (f1+1) n'', we ensure that the NFA ''n'' will have the initial state equal to ''f1+1''. Note that ''f1'' is the final state of the first NFA and, in our code, also its largest state; hence all states of the re-labelled NFA are fresh, which guarantees uniqueness.
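As a small worked illustration of re-labelling, the sketch below repeats the ''NFA'' type and ''relabel'' so that it is self-contained. The values ''atomB'' and ''shifted'' are hypothetical names introduced only for this example, and ''Eq'' is derived here only to allow comparing results.

```haskell
data NFA = NFA { delta :: [(Int, Char, Int)], fin :: [Int] } deriving (Show, Eq)

-- Shift every state (in transitions and in final states) by i.
relabel :: Int -> NFA -> NFA
relabel i (NFA d f) = NFA [ (s + i, c, s' + i) | (s, c, s') <- d ] (map (+ i) f)

-- NFA for the single symbol 'b': states 0 and 1, final state 1.
atomB :: NFA
atomB = NFA [(0, 'b', 1)] [1]

-- Suppose a first NFA uses states 0 and 1, so f1 = 1.
-- Shifting atomB by f1 + 1 = 2 moves it to the fresh states 2 and 3.
shifted :: NFA
shifted = relabel 2 atomB
-- shifted == NFA [(2,'b',3)] [3]
```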
-<code haskell>
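-toNFA :: RegExp -> NFA
--- every NFA built here has initial state 0 and a single final state, which is its largest state
-toNFA EmptyString = NFA [(0,'e',1)] [1]
-toNFA (Atom c) = NFA [(0,c,1)] [1]
--- concatenation: link the final state of e to the (re-labelled) initial state of e'
-toNFA (e :. e') = let NFA delta1 [f1] = toNFA e
-                      NFA delta2 [f2] = relabel (f1+1) (toNFA e')
-                  in NFA (delta1++delta2++[(f1,'e',f1+1)]) [f2]
--- union: new initial state 0 and new final state f2+1
-toNFA (e :| e') = let NFA delta1 [f1] = relabel 1 (toNFA e)
-                      NFA delta2 [f2] = relabel (f1+1) (toNFA e')
-                  in NFA (delta1 ++ delta2 ++[(0,'e',1),
-                                              (0,'e',f1+1),
-                                              (f1,'e',f2+1),
-                                              (f2,'e',f2+1)]) [f2+1]
--- Kleene star: skip e entirely, or loop back from the final state
-toNFA (Kleene e) = let NFA delta [f] = toNFA e
-                   in NFA (delta++[(0,'e',f),(f,'e',0)]) [f]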
-</code>
-Apart from relabelling, the code follows exactly the steps from the proof.
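To check words against an NFA in this list-based representation, one possible approach is to track the set of reachable states. The sketch below is an illustration under stated assumptions, not code from the original page: ''closure'', ''move'' and ''acceptsNFA'' are hypothetical names, the initial state is assumed to be ''0'' (as in the construction above), and ''e'' is treated only as the empty string, never as an input symbol. The sample ''abNFA'' is the value that the construction above yields for ''Atom 'a' :. Atom 'b'''.

```haskell
import Data.List (nub)

data NFA = NFA { delta :: [(Int, Char, Int)], fin :: [Int] } deriving Show

-- States reachable from qs using only 'e' (empty-string) transitions.
closure :: NFA -> [Int] -> [Int]
closure n qs
  | all (`elem` qs) next = qs
  | otherwise            = closure n (nub (qs ++ next))
  where next = [ t | (s, 'e', t) <- delta n, s `elem` qs ]

-- Consume one input symbol from every current state, then close under 'e'.
move :: NFA -> [Int] -> Char -> [Int]
move n qs c =
  closure n (nub [ t | (s, x, t) <- delta n, x == c, x /= 'e', s `elem` qs ])

-- Accept if some final state is reachable after reading the whole word.
acceptsNFA :: NFA -> String -> Bool
acceptsNFA n w = any (`elem` fin n) (foldl (move n) (closure n [0]) w)

-- Result of the construction for Atom 'a' :. Atom 'b'.
abNFA :: NFA
abNFA = NFA [(0, 'a', 1), (2, 'b', 3), (1, 'e', 2)] [3]
```

With these definitions, ''acceptsNFA abNFA "ab"'' evaluates to ''True'' and ''acceptsNFA abNFA "a"'' to ''False''. Keeping the state set as a plain list mirrors the list-based $math[\delta]; a set type would be the natural choice for larger automata.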