Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
lfa:nfa [2017/10/11 12:52] pdmatei created |
lfa:nfa [2020/10/19 15:38] (current) pdmatei |
||
---|---|---|---|
Line 3: | Line 3: | ||
==== Motivation ==== | ==== Motivation ==== | ||
- | In the previous lecture we have investigated the **semantics** of regular expressions and saw that we can determine the language accepted by, e.g. $math[(A\cup B)(a\cup b)*(0 \cup 1)*]. However, given a regular expression $math[e], we are missing a **computational means** for determining if a given word $math[w] is a member of $math[L(e)], and this is precisely the task of the **lexical stage**. | + | In the previous lecture we have investigated the **semantics** of regular expressions and saw how we can determine the language accepted by, e.g. $math[(A\cup B)(a\cup b)*(0 \cup 1)*]. However, given a regular expression $math[e], it is not straightforward to **compute** whether a given word $math[w] is a member of $math[L(e)], and this is precisely the task of the **lexical stage**. |
- | In more formal terms, we have a //generator// for languages, but we lack a means for //accepting// (the words of) languages. | + | In more formal terms, we have a //generator//, i.e. a means to construct a language from a regular expression, but we lack a means for //accepting// (words of) languages. |
- | We shall informally illustrate an algorithm for verifying the membership $math[w \in L((A\cup B)(a\cup b)*(0 \cup 1)*)]. | ||
- | * **input:** a word ''w=c1c2 ... cn''. | ||
- | * define a //set of integers// ''s'' . Set ''s={0}'' | ||
- | * for each ''ci'' in ''w'': | ||
- | * if ''s=={0}'' and (''ci==A'' or ''ci==B'') then ''s={1,2}'' | ||
- | * if ''1''$math[\in]''s'' and (''ci==a'' or ''ci==b'') then ''s={1}'' | ||
- | * if ''1''$math[\in]''s'' and (''ci==0'' or ''ci==1'') then ''s={2}'' | ||
- | * if ''2''$math[\in]''s'' and (''ci==0'' or ''ci==1'') then ''s={2}'' | ||
- | * otherwise return ''false'' | ||
- | * if ''2''$math[\in]''s'' or ''1''$math[\in]''s'' then return ''true'' | ||
- | * otherwise return ''false'' | ||
- | |||
- | The idea underlying the algorithm is the //state variable// ''s''. | ||
- | * Initially, ''s={0}'', which means that we are at the beginning of the word. If this is so, the first symbol must be ''A'' or ''B'', otherwise the word is not accepted by the regexp. | ||
- | * After the correct first symbol has been read, we might expect a sequence of ''a''s and ''b''s or a sequence of ''0''s and ''1''s. We do not know which in advance, hence the state variable is ''{1,2}'', modelling this incomplete knowledge. | ||
- | * If ''1'' is a possible current state and we have read ''a'' or ''b'' then the current state is surely ''1''. As long as this is so, we continue to process symbols. | ||
- | * If ''1'' or ''2'' is a possible current state and we have read ''0'' or ''1'' then the current state is surely ''2''. As long as this is so, we continue to process symbols. | ||
- | * If the end of the word has been reached while ''1'' or ''2'' is a possible current state, we stop and report ''true''. In any other situation, we report ''false''. | ||
==== Nondeterministic automata ==== | ==== Nondeterministic automata ==== | ||
- | The key idea behind the previous algorithm can be generalised to **any** regular expression. In order to do that, we require the concept of **nondeterministic finite automaton** (NFA). We will soon discover some similarities between NFAs and the previous algorithm. | + | The key idea behind the previous algorithm can be generalised to **any** regular expression, and its associated code, written in the same style, yields a similar diagram. |
+ | |||
+ | In practice, it is the diagram, i.e. the **nondeterministic finite automaton** (NFA), which helps us generate the code. | ||
$def[NFA] | $def[NFA] | ||
Line 159: | Line 143: | ||
{{:lfa:slide4.jpg|}} | {{:lfa:slide4.jpg|}} | ||
- | |||
- | |||
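
As a rough illustration, the membership check for $math[(A\cup B)(a\cup b)*(0 \cup 1)*] described in the earlier revision of the Motivation section can be sketched in Python. This is only a sketch under stated assumptions, not code from the lecture: the function name ''accepts'' is hypothetical, and instead of a chain of if-statements each step recomputes the set of possible states from scratch, so that a symbol matching no rule leaves the set empty and the word is rejected.

```python
def accepts(word: str) -> bool:
    """Decide whether word is in L((A∪B)(a∪b)*(0∪1)*).

    The variable s holds the set of states we could currently be in:
    0 = nothing read yet, 1 = reading a/b, 2 = reading 0/1.
    (Hypothetical helper written for illustration only.)
    """
    s = {0}
    for c in word:
        nxt = set()
        if 0 in s and c in ("A", "B"):
            # After the first symbol we cannot tell which block comes next.
            nxt |= {1, 2}
        if 1 in s and c in ("a", "b"):
            nxt.add(1)
        if (1 in s or 2 in s) and c in ("0", "1"):
            nxt.add(2)
        if not nxt:
            # No rule applies to this symbol: reject immediately.
            return False
        s = nxt
    # Accept if the word ended in state 1 or 2 (both starred blocks may be empty).
    return 1 in s or 2 in s


if __name__ == "__main__":
    for w in ["A", "Aab01", "B10", "A0a", "a", ""]:
        print(repr(w), accepts(w))
```

Keeping ''s'' as a set, rather than a single integer, is what models the incomplete knowledge after the first symbol; the same set-of-states idea is what carries over to nondeterministic automata in general.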