10. Context-Free Languages & Lexers

For each context-free grammar G:

  1. describe L(G)
  2. algorithmically construct a PDA that accepts the same language
  3. run the PDA on the given inputs
  4. is the grammar ambiguous? If yes, write an unambiguous grammar that generates the same language

10.1.1 input: aaaabb
$ S \leftarrow aS | aSb | \epsilon $

Solution

The initial stack symbol of the PDA is the grammar's start symbol S.
The PDA has a single state q and accepts by empty stack.
For each rule $ A \leftarrow \gamma $ add a transition $ q \xrightarrow{\epsilon,\ A/\gamma} q $, and for each terminal c add a transition $ q \xrightarrow{c,\ c/\epsilon} q $.

Thus, our PDA has the following transitions looping on state q:

  • $ \epsilon, S/aS $
  • $ \epsilon, S/aSb $
  • $ \epsilon, S/\epsilon $
  • $ a, a/\epsilon $
  • $ b, b/\epsilon $
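
The construction rule above is mechanical, so it can be sketched in a few lines of Python. The dictionary representation of the grammar and the function name `cfg_to_pda` are assumptions made for this illustration; they are not notation from the course.

```python
# Hypothetical representation: each nonterminal maps to its list of
# right-hand sides; a right-hand side is a string of symbols, "" is epsilon.
GRAMMAR = {"S": ["aS", "aSb", ""]}

def cfg_to_pda(grammar):
    """Return the loops on state q as (input read, symbol popped, string pushed)."""
    transitions = []
    # One epsilon-move per rule A <- gamma: pop A, push gamma.
    for nonterminal, bodies in grammar.items():
        for body in bodies:
            transitions.append(("", nonterminal, body))
    # One move per terminal c: read c from the input and pop c from the stack.
    terminals = sorted({symbol
                        for bodies in grammar.values()
                        for body in bodies
                        for symbol in body
                        if symbol not in grammar})
    for c in terminals:
        transitions.append((c, c, ""))
    return transitions

for read, pop, push in cfg_to_pda(GRAMMAR):
    print(f"{read or 'ε'}, {pop}/{push or 'ε'}")
```

The printed lines should correspond one-to-one to the five transitions listed above.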

Input: aaabb
(aaabb, q, S) ⇒ (aaabb, q, aSb) ⇒ (aabb, q, Sb) ⇒ (aabb, q, aSbb) ⇒ (abb, q, Sbb) ⇒ (abb, q, aSbb) ⇒ (bb, q, Sbb) ⇒ (bb, q, bb) ⇒ (b, q, b) ⇒ ($\epsilon$, q, $\epsilon$)
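
The PDA is nondeterministic, so finding a run like the one above means searching over the possible choices. A minimal sketch follows, assuming terminals are lowercase letters and the stack is written as a string with its top on the left; the function name `accepting_run` and the pruning rule are choices made for this example only.

```python
from collections import deque

# Loops on state q for S <- aS | aSb | epsilon, written as
# (input read, symbol popped, string pushed); "" stands for epsilon.
TRANSITIONS = [("", "S", "aS"), ("", "S", "aSb"), ("", "S", ""),
               ("a", "a", ""), ("b", "b", "")]

def accepting_run(word, transitions, start="S"):
    """Breadth-first search for a run that empties both input and stack.

    Returns the list of (remaining input, stack) configurations, or None.
    """
    first = (word, start)
    parent = {first: None}
    queue = deque([first])
    while queue:
        config = queue.popleft()
        rest, stack = config
        if not rest and not stack:
            run = []
            while config is not None:
                run.append(config)
                config = parent[config]
            return run[::-1]
        if not stack:
            continue  # empty stack but input left: dead end
        for read, pop, push in transitions:
            if stack[0] != pop or not rest.startswith(read):
                continue
            nxt = (rest[len(read):], push + stack[1:])
            # Prune: every terminal on the stack must still be matched by input.
            terminals_on_stack = sum(1 for s in nxt[1] if s.islower())
            if terminals_on_stack <= len(nxt[0]) and nxt not in parent:
                parent[nxt] = config
                queue.append(nxt)
    return None

run = accepting_run("aaabb", TRANSITIONS)
print(" ⇒ ".join(f"({rest or 'ε'}, q, {stack or 'ε'})" for rest, stack in run))
```

The search may pick a different accepting run than the one written above, since the grammar is ambiguous. The exercise statement lists the input aaaabb while the worked run uses aaabb; under this sketch both are accepted.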

Is the grammar ambiguous? Yes, because there exist two different leftmost derivations for the word aaabb:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbb ⇒ aaabb
S ⇒ aS ⇒ aaSb ⇒ aaaSbb ⇒ aaabb
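
To see all the ways a word can be derived, the rule choices can be enumerated directly. The sketch below is specific to this grammar (every sentential form contains exactly one S, so every derivation is leftmost); the function name `rule_sequences` is made up for the example.

```python
def rule_sequences(word):
    """Yield each sequence of right-hand sides that derives `word` from S."""
    if word == "":
        yield ["ε"]                      # S <- epsilon
    if word.startswith("a"):
        for tail in rule_sequences(word[1:]):
            yield ["aS"] + tail          # S <- aS
    if len(word) >= 2 and word[0] == "a" and word[-1] == "b":
        for tail in rule_sequences(word[1:-1]):
            yield ["aSb"] + tail         # S <- aSb

for sequence in rule_sequences("aaabb"):
    print(" , ".join(sequence))
```

For aaabb this enumeration finds the two derivations shown above plus a third one, S ⇒ aSb ⇒ aaSb ⇒ aaaSbb ⇒ aaabb. Two are already enough to establish ambiguity.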

The accepted language is $ L(G) = \{a^{m}b^{n} | m \ge n \ge 0\} $

Repaired grammar:
$ S \leftarrow aS | A \\ A \leftarrow aAb | \epsilon $
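
A brute-force check can support (though not prove) that the repaired grammar generates the same language with at most one parse tree per word. The sketch below counts parse trees for every word up to a small length. It assumes the grammar has no cyclic derivations such as A ⇒ ... ⇒ A, which holds for both grammars here; the names `tree_counter`, `ORIGINAL` and `REPAIRED` are hypothetical.

```python
from functools import lru_cache
from itertools import product

# Right-hand sides as strings; "" is epsilon, uppercase keys are nonterminals.
ORIGINAL = {"S": ("aS", "aSb", "")}
REPAIRED = {"S": ("aS", "A"), "A": ("aAb", "")}

def tree_counter(grammar, start="S"):
    """Return a function mapping a word to its number of parse trees."""
    @lru_cache(maxsize=None)
    def count(symbols, word):
        # Number of ways the symbol string `symbols` derives `word`.
        if not symbols:
            return 1 if word == "" else 0
        head, rest = symbols[0], symbols[1:]
        if head not in grammar:  # terminal: must match the next input symbol
            return count(rest, word[1:]) if word.startswith(head) else 0
        total = 0
        for i in range(len(word) + 1):
            right = count(rest, word[i:])
            if right:  # only expand `head` when the remainder can succeed
                total += right * sum(count(body, word[:i])
                                     for body in grammar[head])
        return total
    return lambda word: count(start, word)

original, repaired = tree_counter(ORIGINAL), tree_counter(REPAIRED)
for n in range(7):
    for letters in product("ab", repeat=n):
        word = "".join(letters)
        assert (original(word) > 0) == (repaired(word) > 0)
        assert repaired(word) <= 1
print(original("aaabb"), repaired("aaabb"))
```

The intuition matches the counts: in the repaired grammar, S first produces all the unmatched a's and only then hands over to A, which produces the matched a...b pairs, so the order of rule applications is forced.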

10.1.2 input: abaaaaaa
$ S \leftarrow aAA \\ A \leftarrow aS | bS | a $

10.1.3 input: aaabbbbbccc
$ S \leftarrow ABC \\ A \leftarrow aA | \epsilon \\ B \leftarrow bbB | b \\ C \leftarrow cC | c $

Solution

The PDA has the following transitions looping on state q:

  • $ \epsilon, S/ABC $
  • $ \epsilon, A/aA $
  • $ \epsilon, A/\epsilon $
  • $ \epsilon, B/bbB $
  • $ \epsilon, B/b $
  • $ \epsilon, C/cC $
  • $ \epsilon, C/c $
  • $ a, a/\epsilon $
  • $ b, b/\epsilon $
  • $ c, c/\epsilon $

Input: aaabbbbbccc
(aaabbbbbccc, q, S) ⇒ (aaabbbbbccc, q, ABC) ⇒ (aaabbbbbccc, q, aABC) ⇒ (aabbbbbccc, q, ABC) ⇒ (aabbbbbccc, q, aABC) ⇒ (abbbbbccc, q, ABC) ⇒ (abbbbbccc, q, aABC) ⇒ (bbbbbccc, q, ABC) ⇒ (bbbbbccc, q, BC) ⇒ (bbbbbccc, q, bbBC) ⇒ (bbbbccc, q, bBC) ⇒ (bbbccc, q, BC) ⇒ (bbbccc, q, bbBC) ⇒ (bbccc, q, bBC) ⇒ (bccc, q, BC) ⇒ (bccc, q, bC) ⇒ (ccc, q, C) ⇒ (ccc, q, cC) ⇒ (cc, q, C) ⇒ (cc, q, cC) ⇒ (c, q, C) ⇒ (c, q, c) ⇒ ($\epsilon$, q, $\epsilon$)

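Is the grammar ambiguous? No: every word splits uniquely into a block of a's, a block of b's and a block of c's, generated by A, B and C respectively, and each block has exactly one derivation.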

The accepted language is $ L(G) = \{a^{m}b^{2n + 1}c^{p+1} | m,n,p \ge 0\} $
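
Because the three blocks are independent of one another, this language happens to be regular, so the description can be cross-checked with Python's `re` module. The regular expression below is a direct transcription of the formula (an odd number of b's, at least one c); the helper name `in_language` is made up for the example.

```python
import re

# a^m b^(2n+1) c^(p+1): any a's, an odd number of b's, one or more c's.
LANGUAGE = re.compile(r"a*(bb)*bc+")

def in_language(word):
    """Return True if `word` has the shape a^m b^(2n+1) c^(p+1)."""
    return LANGUAGE.fullmatch(word) is not None

print(in_language("aaabbbbbccc"))  # the given input: 3 a's, 5 b's, 3 c's
print(in_language("aabbcc"))       # even number of b's
```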

Given the following specs, construct the lexer DFA as presented in Lecture 14:

  • PAIRS: $ (10 | 01)* $
  • ONES: $ 1+ $
  • NO_CONSEC_ONE: $ (1 | \epsilon)(01 | 0)* $
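
To check hand-computed answers, the splitting can be simulated without building the DFA. The sketch below swaps in Python's `re` module for the lexer DFA and makes assumptions that may differ from Lecture 14: the longest non-empty match wins, ties go to the token listed first, and empty lexemes are never emitted (both PAIRS and NO_CONSEC_ONE match the empty string). The names `SPEC` and `tokenize` are hypothetical.

```python
import re

# Tokens in priority order; (1 | epsilon) is written as 1? in Python syntax.
SPEC = [
    ("PAIRS", re.compile(r"(10|01)*")),
    ("ONES", re.compile(r"1+")),
    ("NO_CONSEC_ONE", re.compile(r"1?(01|0)*")),
]

def tokenize(text):
    """Split `text` into (token name, lexeme) pairs by longest match."""
    pos, lexemes = 0, []
    while pos < len(text):
        # Try the longest candidate first, shrinking until some token matches.
        for end in range(len(text), pos, -1):
            name = next((n for n, rx in SPEC if rx.fullmatch(text, pos, end)),
                        None)
            if name is not None:
                lexemes.append((name, text[pos:end]))
                pos = end
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return lexemes

print(tokenize("010101"))
```

Trying every prefix with `fullmatch` is quadratic in the input length, which is fine for inputs of this size; a DFA-based lexer would instead remember the last accepting state while scanning.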

Separate the following input strings into lexemes:

  • 010101
  • 1010101011
  • 01110101001
  • 01010111111001010
  • 1101101001111100001010011001