11. Context-Free Grammars

Write the following grammars in CNF:

11.1.1
$ S \leftarrow ABC \\ A \leftarrow aAb \mid \epsilon \\ B \leftarrow bBc \mid bc \\ C \leftarrow cC \mid c $

Step 1: Eliminate isolated terminals
$ X \leftarrow \alpha y \beta $ becomes $ X \leftarrow \alpha Y \beta, Y \leftarrow y $

$ S \leftarrow ABC \\ A \leftarrow L_aAL_b | \epsilon \\ B \leftarrow L_bBL_c | L_bL_c \\ C \leftarrow L_cC | L_c \\ L_a \leftarrow a \\ L_b \leftarrow b \\ L_c \leftarrow c $
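Step 1 can be sketched in Python as follows. This is an illustrative sketch, not part of the original solution: the dictionary encoding (each non-terminal mapped to a list of tuples, with `()` for $ \epsilon $) and the names `G0`, `G1` and `lift_terminals` are assumptions. Like the solution above, it also wraps the lone terminal in $ C \leftarrow c $, which Step 4 later undoes.

```python
# Grammar 11.1.1: each non-terminal maps to a list of right-hand sides;
# a right-hand side is a tuple of symbols and () stands for epsilon.
G0 = {
    "S": [("A", "B", "C")],
    "A": [("a", "A", "b"), ()],
    "B": [("b", "B", "c"), ("b", "c")],
    "C": [("c", "C"), ("c",)],
}


def lift_terminals(grammar):
    """Replace each terminal y on a right-hand side by L_y and add L_y <- y."""
    result, helpers = {}, {}
    for x, rhss in grammar.items():
        result[x] = []
        for rhs in rhss:
            new_rhs = []
            for s in rhs:
                if s in grammar:            # non-terminal: keep as is
                    new_rhs.append(s)
                else:                       # terminal: use its helper non-terminal
                    helpers[f"L_{s}"] = [(s,)]
                    new_rhs.append(f"L_{s}")
            result[x].append(tuple(new_rhs))
    result.update(helpers)
    return result


G1 = lift_terminals(G0)
```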

Step 2: Eliminate long rules
A rule with 3 or more non-terminals on its right-hand side is replaced by a chain of rules with exactly 2 non-terminals each, using fresh non-terminals.

$ S \leftarrow AS_1 \\ S_1 \leftarrow BC \\ A \leftarrow L_aA_1 | \epsilon \\ A_1 \leftarrow AL_b \\ B \leftarrow L_bB_1 | L_bL_c \\ B_1 \leftarrow BL_c \\ C \leftarrow L_cC | L_c \\ L_a \leftarrow a \\ L_b \leftarrow b \\ L_c \leftarrow c $
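A possible Python sketch of Step 2, under the same assumed dictionary encoding (tuples for right-hand sides, `()` for $ \epsilon $). The function name `split_long_rules` and the fresh-name scheme `X_k` are assumptions chosen to mirror the names used above; the sketch assumes these names are not already taken.

```python
# Result of Step 1 for grammar 11.1.1.
G1 = {
    "S": [("A", "B", "C")],
    "A": [("L_a", "A", "L_b"), ()],
    "B": [("L_b", "B", "L_c"), ("L_b", "L_c")],
    "C": [("L_c", "C"), ("L_c",)],
    "L_a": [("a",)],
    "L_b": [("b",)],
    "L_c": [("c",)],
}


def split_long_rules(grammar):
    """Turn X <- s1 s2 ... sn (n >= 3) into X <- s1 X_k, X_k <- s2 ... sn, recursively."""
    result = {x: [] for x in grammar}
    for x, rhss in grammar.items():
        counter = 0                          # numbers the fresh non-terminals of x
        for rhs in rhss:
            head = x
            while len(rhs) > 2:
                counter += 1
                fresh = f"{x}_{counter}"     # assumed not to clash with existing names
                result[head].append((rhs[0], fresh))
                result[fresh] = []
                head, rhs = fresh, rhs[1:]
            result[head].append(rhs)         # at most 2 symbols remain
    return result


G2 = split_long_rules(G1)
```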

Step 3: Eliminate $ \epsilon $ rules
There is one $ \epsilon $ rule, for non-terminal A: $ A \leftarrow \epsilon $.
For every rule $ X \leftarrow \alpha A \beta $, add the alternative without A, obtaining $ X \leftarrow \alpha A \beta | \alpha \beta $. Then remove the $ \epsilon $ rule.

$ S \leftarrow S_1 | AS_1 \\ S_1 \leftarrow BC \\ A \leftarrow L_aA_1 \\ A_1 \leftarrow AL_b | L_b \\ B \leftarrow L_bB_1 | L_bL_c \\ B_1 \leftarrow BL_c \\ C \leftarrow L_cC | L_c \\ L_a \leftarrow a \\ L_b \leftarrow b \\ L_c \leftarrow c $
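A minimal Python sketch of Step 3, generalised to any number of nullable non-terminals. The encoding and the name `eliminate_epsilon` are assumptions, and the sketch assumes the start symbol is not nullable (true here), since otherwise $ \epsilon $ would be lost from the language.

```python
from itertools import product

# Result of Step 2 for grammar 11.1.1; () stands for epsilon.
G2 = {
    "S": [("A", "S_1")],
    "S_1": [("B", "C")],
    "A": [("L_a", "A_1"), ()],
    "A_1": [("A", "L_b")],
    "B": [("L_b", "B_1"), ("L_b", "L_c")],
    "B_1": [("B", "L_c")],
    "C": [("L_c", "C"), ("L_c",)],
    "L_a": [("a",)],
    "L_b": [("b",)],
    "L_c": [("c",)],
}


def eliminate_epsilon(grammar):
    # Fixed point: X is nullable if some right-hand side consists of nullable symbols only.
    nullable, changed = set(), True
    while changed:
        changed = False
        for x, rhss in grammar.items():
            if x not in nullable and any(all(s in nullable for s in rhs) for rhs in rhss):
                nullable.add(x)
                changed = True
    result = {}
    for x, rhss in grammar.items():
        result[x] = []
        for rhs in rhss:
            # Each nullable symbol may be kept or dropped; other symbols are always kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
            for choice in product(*options):
                new_rhs = tuple(s for part in choice for s in part)
                if new_rhs and new_rhs not in result[x]:   # drop epsilon and duplicates
                    result[x].append(new_rhs)
    return result


G3 = eliminate_epsilon(G2)
```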

Step 4: Eliminate unit rules
In order to eliminate each rule $ Y \leftarrow X $, we must:
a. Identify all rules $ X \leftarrow \alpha_1 | \ldots | \alpha_n $; if none exists, simply remove $ Y \leftarrow X $
b. Otherwise, replace (expand) $ Y \leftarrow X $ by the rule $ Y \leftarrow \alpha_1 | \ldots | \alpha_n $
c. Repeat the process for all remaining unit rules

$ S \leftarrow BC | AS_1 \\ S_1 \leftarrow BC \\ A \leftarrow L_aA_1 \\ A_1 \leftarrow AL_b | b \\ B \leftarrow L_bB_1 | L_bL_c \\ B_1 \leftarrow BL_c \\ C \leftarrow L_cC | c \\ L_a \leftarrow a \\ L_b \leftarrow b \\ L_c \leftarrow c $
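Step 4 can be sketched in Python as below. Instead of expanding unit rules one at a time, this sketch computes, for each non-terminal, the set of non-terminals reachable through unit rules and copies their non-unit right-hand sides; this is a swapped-in but equivalent formulation. The encoding and the name `eliminate_unit_rules` are assumptions.

```python
# Result of Step 3 for grammar 11.1.1.
G3 = {
    "S": [("S_1",), ("A", "S_1")],
    "S_1": [("B", "C")],
    "A": [("L_a", "A_1")],
    "A_1": [("A", "L_b"), ("L_b",)],
    "B": [("L_b", "B_1"), ("L_b", "L_c")],
    "B_1": [("B", "L_c")],
    "C": [("L_c", "C"), ("L_c",)],
    "L_a": [("a",)],
    "L_b": [("b",)],
    "L_c": [("c",)],
}


def eliminate_unit_rules(grammar):
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in grammar   # a single non-terminal

    result = {}
    for x in grammar:
        # Non-terminals reachable from x using unit rules only (including x itself).
        closure, stack = {x}, [x]
        while stack:
            y = stack.pop()
            for rhs in grammar[y]:
                if is_unit(rhs) and rhs[0] not in closure:
                    closure.add(rhs[0])
                    stack.append(rhs[0])
        # x inherits every non-unit right-hand side of its closure.
        result[x] = []
        for y in grammar:
            if y in closure:
                for rhs in grammar[y]:
                    if not is_unit(rhs) and rhs not in result[x]:
                        result[x].append(rhs)
    return result


G4 = eliminate_unit_rules(G3)
```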

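Step 5: Eliminate unused, cyclic non-terminals and their rules
Every non-terminal is reachable from S and derives at least one terminal string, so nothing is removed; the grammar from Step 4 is already in CNF.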
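As a sanity check, the grammar obtained after Step 4 can be tested for the CNF shape and used with the CYK algorithm, which only works on CNF grammars. A minimal Python sketch; the names `is_cnf` and `cyk` and the dictionary encoding are assumptions, not part of the original solution.

```python
# Grammar 11.1.1 after Step 4.
G4 = {
    "S": [("B", "C"), ("A", "S_1")],
    "S_1": [("B", "C")],
    "A": [("L_a", "A_1")],
    "A_1": [("A", "L_b"), ("b",)],
    "B": [("L_b", "B_1"), ("L_b", "L_c")],
    "B_1": [("B", "L_c")],
    "C": [("L_c", "C"), ("c",)],
    "L_a": [("a",)],
    "L_b": [("b",)],
    "L_c": [("c",)],
}


def is_cnf(grammar):
    """Every rule must be X <- YZ (two non-terminals) or X <- a (one terminal)."""
    for rhss in grammar.values():
        for rhs in rhss:
            single_terminal = len(rhs) == 1 and rhs[0] not in grammar
            two_nonterminals = len(rhs) == 2 and all(s in grammar for s in rhs)
            if not (single_terminal or two_nonterminals):
                return False
    return True


def cyk(grammar, word, start="S"):
    """CYK membership test; table[i][l-1] holds the non-terminals deriving word[i:i+l]."""
    n = len(word)
    if n == 0:
        return False                      # a CNF grammar without S <- epsilon rejects ""
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for x, rhss in grammar.items():
            if (ch,) in rhss:
                table[i][0].add(x)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for x, rhss in grammar.items():
                    if any(len(r) == 2 and r[0] in left and r[1] in right for r in rhss):
                        table[i][length - 1].add(x)
    return start in table[0][n - 1]
```

For instance, `cyk(G4, "abbcc")` checks the word obtained from $ A \Rightarrow^* ab $, $ B \Rightarrow bc $ and $ C \Rightarrow c $.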

11.1.2
$ S \leftarrow 0SA \mid ASB \\ A \leftarrow 0BA \mid 1S \mid 0A \\ B \leftarrow B1 \mid 0B \mid 1 \mid 0 $

Step 1: Eliminate terminals
$ S \leftarrow ZSA | ASB \\ A \leftarrow ZBA | US | ZA \\ B \leftarrow BU | ZB | U | Z \\ U \leftarrow 1 \\ Z \leftarrow 0 $

Step 2: Eliminate long rules
$ S \leftarrow ZS_1 | AS_2 \\ S_1 \leftarrow SA \\ S_2 \leftarrow SB \\ A \leftarrow ZA_1 | US | ZA \\ A_1 \leftarrow BA \\ B \leftarrow BU | ZB | U | Z \\ U \leftarrow 1 \\ Z \leftarrow 0 $

Step 3: Eliminate $ \epsilon $ rules
The grammar has no $ \epsilon $ rules, so nothing changes.

Step 4: Eliminate unit rules
$ S \leftarrow ZS_1 | AS_2 \\ S_1 \leftarrow SA \\ S_2 \leftarrow SB \\ A \leftarrow ZA_1 | US | ZA \\ A_1 \leftarrow BA \\ B \leftarrow BU | ZB | 1 | 0 \\ U \leftarrow 1 \\ Z \leftarrow 0 $

Step 5: Eliminate unused, cyclic non-terminals and their rules
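For this step, a fixed-point computation of the productive non-terminals (those that derive at least one terminal string) is a quick check. The Python sketch below is illustrative; the name `productive_nonterminals` and the dictionary encoding are assumptions. On the grammar from Step 4 it reports only B, U and Z, because every rule for S contains S and every rule for A contains A or S. Under this reading S derives no word at all, so the grammar generates the empty language.

```python
# Grammar 11.1.2 after Step 4.
G = {
    "S": [("Z", "S_1"), ("A", "S_2")],
    "S_1": [("S", "A")],
    "S_2": [("S", "B")],
    "A": [("Z", "A_1"), ("U", "S"), ("Z", "A")],
    "A_1": [("B", "A")],
    "B": [("B", "U"), ("Z", "B"), ("1",), ("0",)],
    "U": [("1",)],
    "Z": [("0",)],
}


def productive_nonterminals(grammar):
    """Non-terminals that derive at least one string of terminals."""
    productive, changed = set(), True
    while changed:
        changed = False
        for x, rhss in grammar.items():
            if x in productive:
                continue
            # x is productive if some right-hand side uses only terminals
            # and non-terminals already known to be productive.
            if any(all(s not in grammar or s in productive for s in rhs) for rhs in rhss):
                productive.add(x)
                changed = True
    return productive


PRODUCTIVE = productive_nonterminals(G)
```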

11.2.1 Give an example of a regular grammar that generates $ L(1^*0^*) $.

$ S \leftarrow I \\ I \leftarrow 1I | O \\ O \leftarrow 0O | \epsilon $
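The grammar above can be checked against the regular expression on all short words. A small Python sketch, assuming single-character symbols (upper-case keys are non-terminals, `""` stands for $ \epsilon $); the names `derives` and `agrees_with_regex` are hypothetical helpers. The recursion terminates here because every non-unit rule starts with a terminal and the unit rules do not form a cycle.

```python
import re
from itertools import product

# S <- I, I <- 1I | O, O <- 0O | epsilon
GRAMMAR = {"S": ["I"], "I": ["1I", "O"], "O": ["0O", ""]}


def derives(grammar, form, word):
    """True if the sentential form can derive exactly `word` (left-most expansion)."""
    if not form:
        return word == ""
    head, tail = form[0], form[1:]
    if head not in grammar:                       # terminal: must match the next character
        return word[:1] == head and derives(grammar, tail, word[1:])
    return any(derives(grammar, rhs + tail, word) for rhs in grammar[head])


def agrees_with_regex(max_len):
    """Compare the grammar with 1*0* on every binary word up to max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            word = "".join(letters)
            expected = re.fullmatch(r"1*0*", word) is not None
            if derives(GRAMMAR, "S", word) != expected:
                return False
    return True
```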

11.3.1 For each of the following DFAs, algorithmically create a regular grammar that generates the same language.

For each $ \delta(q, c) = q' $ build $ S_q \leftarrow cS_{q'} $
For each final state q build $ S_q \leftarrow \epsilon $

$ S_0 \leftarrow 0S_0 | 1S_1 | \epsilon \\ S_1 \leftarrow 0S_2 | 1S_0 \\ S_2 \leftarrow 1S_2 | 0S_1 $

$ S_1 \leftarrow 1S_2 | 0S_1 | \epsilon \\ S_2 \leftarrow 0S_1 | 1S_3 | \epsilon \\ S_3 \leftarrow 0S_3 | 1S_3$
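The construction rule used in 11.3.1 can be sketched in Python as follows. The transition table `DELTA` and final-state set `FINALS` are assumptions: the DFA figures are not reproduced here, so the table is read back from the first grammar above. The function name `dfa_to_grammar` is likewise hypothetical.

```python
# Assumed transition table of the first DFA, reconstructed from its grammar.
DELTA = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 1, (2, "1"): 2,
}
FINALS = {0}


def dfa_to_grammar(delta, finals):
    """Build S_q <- c S_q' for each transition and S_q <- epsilon for each final state."""
    grammar = {}
    for (q, c), q_next in sorted(delta.items()):
        grammar.setdefault(f"S_{q}", []).append(f"{c}S_{q_next}")
    for q in sorted(finals):
        grammar.setdefault(f"S_{q}", []).append("ε")
    return grammar


GRAMMAR = dfa_to_grammar(DELTA, FINALS)
for lhs, alternatives in GRAMMAR.items():
    print(lhs, "<-", " | ".join(alternatives))
```

The start symbol of the resulting grammar is the non-terminal of the initial state (assumed to be state 0 here).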

11.4.1 Can a regular grammar be in Chomsky Normal Form?

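No, not in general. CNF only allows rules of the form $ X \leftarrow YZ $ and $ X \leftarrow a $, while the rules $ X \leftarrow aY $ of a regular grammar mix a terminal with a non-terminal. Only a degenerate regular grammar whose rules are all of the form $ X \leftarrow a $ satisfies both definitions.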

11.4.2 Write an algorithm that verifies whether or not a regular grammar generates an infinite language.

Create a graph $ G=(V,E) $ whose nodes are the terminals and non-terminals. For each rule of the form $ X \leftarrow aY $ create a directed edge $ (X,Y) $. For each rule $ X \leftarrow a $, also create an edge $ (X,a) $.

First, we need to check that the grammar generates a language different from $ \emptyset $, so we check if there is a path from the start symbol to some terminal.
Second, we need to check if there are loops. It suffices to check if the graph is a tree.
MU: this condition is not sufficient, as a grammar can generate a finite language while also having looping rules. If the grammar comes from the DFA transformation, such loops come from the sink states of the DFA; see state 3 of the second DFA from 11.3.1. Only loops through non-terminals that are reachable from the start symbol and can still reach a terminal make the language infinite.
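Taking the remark about sink-state loops into account, one possible Python sketch of the check is given below. It is not the lecture's algorithm: the rule encoding (triples `(X, a, Y)` for $ X \leftarrow aY $ and `(X, a, None)` for $ X \leftarrow a $, with `a = ""` allowed for $ X \leftarrow \epsilon $) and the name `generates_infinite_language` are assumptions, and unit rules $ X \leftarrow Y $ are assumed absent.

```python
def generates_infinite_language(rules, start):
    """Infinite iff a cycle exists among reachable, productive non-terminals."""
    succ, terminating = {}, set()
    for x, _, y in rules:
        succ.setdefault(x, set())
        if y is None:
            terminating.add(x)              # X <- a (or X <- epsilon) ends a derivation
        else:
            succ[x].add(y)
            succ.setdefault(y, set())

    # Productive: non-terminals from which some terminating rule can be reached.
    productive, changed = set(terminating), True
    while changed:
        changed = False
        for x, ys in succ.items():
            if x not in productive and ys & productive:
                productive.add(x)
                changed = True

    # Reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        x = stack.pop()
        if x not in reachable:
            reachable.add(x)
            stack.extend(succ.get(x, ()))

    useful = reachable & productive
    state = {}                              # 1 = on the DFS stack, 2 = finished

    def has_cycle(x):
        state[x] = 1
        for y in succ[x] & useful:
            if state.get(y) == 1 or (y not in state and has_cycle(y)):
                return True
        state[x] = 2
        return False

    return any(x not in state and has_cycle(x) for x in sorted(useful))


# S <- a | bD, D <- aD: the loop on D never terminates, so the language is {a}.
FINITE_WITH_LOOP = [("S", "a", None), ("S", "b", "D"), ("D", "a", "D")]
# S <- aS | b generates a*b.
INFINITE = [("S", "a", "S"), ("S", "b", None)]
```

If the set of useful non-terminals is empty, the language is empty and the function returns `False`.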

11.4.3. Prove that any DFA can be converted to a regular grammar.

See lecture

11.4.4. Is there an algorithm to remove ambiguity from regular grammars?

Yes. A regular grammar $ G $ is ambiguous iff there exists some word $ w $ that is obtained by two different left-most derivations, which are necessarily of the form $ S \implies \alpha_1 X_1 \implies \alpha_2 X_2 \ldots \implies \alpha_n $, where each $ X_i $ is a non-terminal and each $ \alpha_i $ is a (possibly empty) string of terminals. In the corresponding NFA for $ G $, this amounts to having two different configuration chains that start from $ (q_0,w) $ and end in a final state after consuming the word $ w $. We can determinize the NFA. In the resulting DFA (in any DFA, for that matter), it is not possible to accept the same word using two different transition sequences, since each state and character uniquely determine the next state. Hence, if we convert this DFA back to a regular grammar, we get an unambiguous one.

11.4.5. Show the following grammar is ambiguous: $ S \leftarrow aSbS | bSaS | \epsilon $. Write a non-ambiguous equivalent.

The grammar is indeed ambiguous, since the word abab has two different left-most derivations: $ S \Rightarrow aSbS \Rightarrow abS \Rightarrow abaSbS \Rightarrow^* abab $, and also $ S \Rightarrow aSbS \Rightarrow abSaSbS \Rightarrow^* abab $. This grammar generates $ \{ w \in \{a,b\}^* \mid \#_a(w) = \#_b(w) \} $. See the solution from the previous lab.
Idea: A = non-terminal that promises exactly one extra letter 'a'; B = non-terminal that promises exactly one extra letter 'b'
$ S \leftarrow aBS | bAS | \epsilon $
$ A \leftarrow a | bAA $
$ B \leftarrow b | aBB $
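Both claims can be checked on short words by counting left-most derivations. The Python sketch below is illustrative only: symbols are single characters (upper-case keys are non-terminals, `""` is $ \epsilon $), and the names `count_parses` and `all_words` are hypothetical. It relies on the fact that in both grammars every non-$ \epsilon $ rule starts with a terminal, which keeps the recursion finite.

```python
from functools import lru_cache
from itertools import product

AMBIGUOUS = {"S": ["aSbS", "bSaS", ""]}
UNAMBIGUOUS = {"S": ["aBS", "bAS", ""], "A": ["a", "bAA"], "B": ["b", "aBB"]}


def count_parses(grammar, word, start="S"):
    """Number of distinct left-most derivations of `word` from `start`."""

    @lru_cache(maxsize=None)
    def count(form, rest):
        if not form:
            return 1 if rest == "" else 0
        head, tail = form[0], form[1:]
        if head not in grammar:                 # terminal: must match the input
            return count(tail, rest[1:]) if rest[:1] == head else 0
        # Non-terminal: try every rule for the left-most non-terminal.
        return sum(count(rhs + tail, rest) for rhs in grammar[head])

    return count(start, word)


def all_words(max_len, alphabet="ab"):
    """All words over the alphabet with length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)
```

For example, `count_parses(AMBIGUOUS, "abab")` counts the two derivations listed above, while looping over `all_words(6)` with `UNAMBIGUOUS` is a way to confirm that no short word has more than one derivation.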

11. Context-Free Grammars

11.1. Chomsky Normal Form

11.2. Regular Grammars

11.3. DFA to Regular Grammar

11.4. Short Exercises