CFG - PDA equivalence

We prove that the languages generated by context-free grammars (CFGs) coincide with those accepted by pushdown automata (PDAs). The proof has two steps: we first build a PDA from a grammar, then a grammar from a PDA.

Given a Context-Free grammar $ G=(V,\Sigma,R,S)$ , we build a PDA $ M=(K,\Sigma,\Gamma,\Delta,q_0,Z_0,F)$ which accepts precisely $ L(G)$ . The main idea is:

  • for each derivation step $ \alpha A \beta \Rightarrow_G \alpha \gamma \beta$ in $ G$ , we will execute one transition in $ M$ .

Consider the following example:

$ S\rightarrow AB$

$ A\rightarrow aB \mid bA$

$ B\rightarrow bB\mid \epsilon$

As well as the derivation:

$ S\Rightarrow AB \Rightarrow bAB \Rightarrow baBB \Rightarrow babBB \Rightarrow babbBB \Rightarrow babbB \Rightarrow babb$

which illustrates $ babb\in L(G)$ for our example. For this derivation, we should have the configuration sequence:

$ (?,babb,Z_0) \vdash (?,babb,SZ_0) \vdash (?,babb,ABZ_0) \vdash (?,babb,bABZ_0) \vdash$

$ (?,abb,ABZ_0) \vdash (?,abb,aBBZ_0)\vdash(?,bb,BBZ_0)\vdash(?,bb,bBBZ_0) \vdash$

$ (?,b,BBZ_0) \vdash (?,b,bBBZ_0) \vdash (?,\epsilon,BBZ_0)\vdash(?,\epsilon,BZ_0) \vdash$

$ (?,\epsilon,Z_0)$
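The configuration sequence above can be reproduced mechanically. The following is a minimal sketch, not part of the construction itself: it assumes single-character symbols with upper-case non-terminals, writes the stack as a string whose first character is the top, and uses `$` in place of $ Z_0$ . The function name `trace` and the list `used` are hypothetical names chosen for illustration.

```python
def trace(word, choices, start="S", bottom="$"):
    """Return the configurations (remaining input, stack) visited when the
    non-terminal on top of the stack is expanded with the next production
    from `choices`, and a terminal on top is matched against the input."""
    rest, stack = word, start + bottom
    configs = [(word, bottom), (rest, stack)]
    choices = list(choices)
    while stack != bottom:
        top = stack[0]
        if top.isupper():                  # non-terminal: expand it
            lhs, rhs = choices.pop(0)
            assert lhs == top, "choices must follow a left-most derivation"
            stack = rhs + stack[1:]
        else:                              # terminal: it must match the input
            assert rest[:1] == top, "terminal on the stack does not match the input"
            rest, stack = rest[1:], stack[1:]
        configs.append((rest, stack))
    return configs

# Productions used, in order, by the left-most derivation
# S => AB => bAB => baBB => babBB => babbBB => babbB => babb
used = [("S", "AB"), ("A", "bA"), ("A", "aB"),
        ("B", "bB"), ("B", "bB"), ("B", ""), ("B", "")]
steps = trace("babb", used)
for rest, stack in steps:
    print(f"({rest or 'ε'}, {stack})")
```

The sketch only replays a derivation that is already known; choosing the productions non-deterministically is the job of the PDA itself.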

Ideas:

  • each possible derivation step in $ G$ (not necessarily part of a successful derivation) should correspond to a transition in $ M$ ;
  • we use the stack to hold the symbols which are due to be expanded or matched. For this to work, we must consider left-most derivations only;
  • whenever the top of the stack is a terminal symbol coinciding with the next input symbol, we pop it and consume that input symbol;
  • whenever the top of the stack is a non-terminal, we pop it and (non-deterministically) push the right-hand side of one of its productions.

Construction

  • $ K = \{q_0,p\}$
  • $ \Gamma = V\cup\{Z_0\}$
  • build transition $ (q_0,\epsilon,\epsilon,p,S)$ which puts the start symbol $ S$ on the stack;
  • for each production $ A\rightarrow\gamma$ with $ \gamma\in V^*$ , we build transition $ (p,\epsilon,A,p,\gamma)$ , which replaces $ A$ with $ \gamma$ on the stack, without consuming the input
  • for each symbol $ a\in\Sigma$ , build transition $ (p,a,a,p,\epsilon)$ , which pops a symbol off the stack, once it is read at input;
  • $ F=\{p\}$
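The construction above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: symbols are single characters, `""` plays the role of $ \epsilon$ , `$` stands for $ Z_0$ , and a word is accepted when $ M$ is in state $ p$ with the input consumed and only $ Z_0$ left on the stack. The names `cfg_to_pda`, `accepts` and the pruning bound `max_stack` are hypothetical; the bound only keeps the search finite and may need to be raised for other grammars.

```python
from collections import deque

def cfg_to_pda(productions, terminals, start):
    """Transitions are tuples (state, input, popped, next_state, pushed)."""
    delta = [("q0", "", "", "p", start)]            # put S on the stack
    for lhs, rhs in productions:                     # expand a non-terminal
        delta.append(("p", "", lhs, "p", rhs))
    for a in terminals:                              # match a terminal
        delta.append(("p", a, a, "p", ""))
    return delta

def accepts(delta, word, final="p", bottom="$", max_stack=None):
    """Breadth-first search over configurations (state, remaining input, stack)."""
    if max_stack is None:
        max_stack = 2 * len(word) + 4                # assumption of this sketch
    start = ("q0", word, bottom)
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == final and rest == "" and stack == bottom:
            return True
        for q, a, pop, r, push in delta:
            if q != state or not rest.startswith(a) or not stack.startswith(pop):
                continue
            nxt = (r, rest[len(a):], push + stack[len(pop):])
            if len(nxt[2]) <= max_stack and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# The example grammar: S -> AB, A -> aB | bA, B -> bB | ε
G = [("S", "AB"), ("A", "aB"), ("A", "bA"), ("B", "bB"), ("B", "")]
M = cfg_to_pda(G, terminals="ab", start="S")
print(accepts(M, "babb"))   # True
print(accepts(M, "bb"))     # False: every word of L(G) contains exactly one a
```

The PDA has one transition for the start symbol, one per production and one per terminal, so its size is linear in the size of $ G$ .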

Proof

To prove that $ L(M)=L(G)$ , where $ M$ is built from $ G$ following the above rules, we must observe that $ M$ does not simulate all possible derivations, but only the left-most ones (as shown by our example). The reason is that $ M$ consumes input symbols as it encounters them, and can only do this from left to right.

Thus, we need to establish:

Proposition:

Given a CFG $ G$ and a word $ w$ such that $ S\Rightarrow^*_G w$ , there exists a derivation of $ w$ in which the left-most non-terminal is always the one expanded.

In other words, if we can derive $ w$ in $ G$ , then we can also derive it via left-most derivations only. We omit the proof for this proposition.
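As an illustration on the example grammar, $ babb$ can also be derived by always expanding the right-most non-terminal:

$ S\Rightarrow AB \Rightarrow A \Rightarrow bA \Rightarrow baB \Rightarrow babB \Rightarrow babbB \Rightarrow babb$

This derivation applies the same productions as the left-most derivation shown earlier ( $ S\rightarrow AB$ and $ A\rightarrow bA$ and $ A\rightarrow aB$ once each, $ B\rightarrow bB$ and $ B\rightarrow\epsilon$ twice each), only in a different order. Reordering the steps so that the left-most non-terminal is expanded first gives back the left-most derivation.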

We first show that $ L(G) \subseteq L(M)$ .

Proposition:

If $ S\Rightarrow_G^* \alpha\beta$ , using left-most derivations only, with $ \alpha\in\Sigma^*$ and $ \beta\in(V\setminus\Sigma)V^*\cup\{\epsilon\}$ , then $ (p,\alpha,S)\vdash^*_M(p,\epsilon,\beta)$ .

Our inclusion follows for $ \beta=\epsilon$ .

Proof:

The proof is by induction over the length of the derivation.

Basis: zero-length derivation.

$ S\Rightarrow^*S$ in zero steps. Then $ (p,\epsilon,S)\vdash^*_M(p,\epsilon,S)$ , by reflexivity of $ \vdash_M^*$ .

Induction step: Suppose $ S\Rightarrow^* \alpha\beta$ in $ n+1$ steps. Then $ S\Rightarrow^* uv \Rightarrow\alpha\beta$ , where $ uv$ is derived in $ n$ steps. Here $ \alpha$ and $ u$ contain terminal symbols only, $ v$ starts with a non-terminal, and $ \beta$ is either empty or starts with a non-terminal.

Let us look at the last derivation step $ uv\Rightarrow\alpha\beta$ . Since $ v$ must start with a non-terminal, $ v$ is a word of the form $ Av'$ . Since $ A$ is the left-most non-terminal, the step applies a production $ A\rightarrow\gamma$ of $ G$ . Moreover, we can safely assume that $ \gamma=xBy$ where $ x\in\Sigma^*$ and $ B$ is a non-terminal (the reasoning is similar if $ \gamma$ contains no non-terminal). Therefore, our derivation step actually has the following structure:

$ uAv' \Rightarrow uxByv'$ where $ \alpha=ux$ and $ \beta=Byv'$ .

Since $ S\Rightarrow^* uAv'$ in $ n$ steps, by the induction hypothesis, $ (p,u,S)\vdash^*_M(p,\epsilon,Av')$ .

Let us start from configuration $ (p,\alpha,S) = (p,ux,S)$ . The induction hypothesis entails that we can consume the $ u$ portion of the word: $ (p,ux,S)\vdash^*_M(p,x,Av')$ . By construction of $ M$ , we can replace $ A$ on the stack without consuming the input: $ (p,x,Av')\vdash_M(p,x,xByv')$ . We have just simulated the production $ A\rightarrow xBy$ . By construction of $ M$ , we can also consume each symbol of $ x$ , while popping it from the stack: $ (p,x,xByv')\vdash^*_M(p,\epsilon,Byv')$ . The word $ Byv'$ is exactly $ \beta$ (i.e. a word starting with a non-terminal).

The proof is finished.

Next, we show $ L(M) \subseteq L(G)$ via the following proposition:

Proposition:

If $ (p,\alpha,S)\vdash^*_M(p,\epsilon,\beta)$ , where $ \alpha\in\Sigma^*$ and $ \beta\in V^*$ , then $ S\Rightarrow_G^*\alpha\beta$ .

Notice that this implication is not precisely the converse of the previous one: $ \beta$ need not start with a non-terminal. The proof is similar to the above. We leave it as an exercise.

To construct a grammar from a PDA, we need to view the sequence of transitions of a PDA as follows:

  • it is a sequence of pop-events, during which parts of the input are consumed;
  • a pop-event of symbol $ A$ is a sequence of pushes and pops (which do not affect $ A$ , or the symbols under it), which ultimately ends with the popping of $ A$ .

With this in mind, we shall construct non-terminals in a grammar as triples:

  • $ \langle qXr \rangle$ where $ q,r$ are states of the PDA and $ X$ is a symbol.
  • such a non-terminal models a sequence of transitions where the PDA goes from state $ q$ to state $ r$ , while the pop-event of $ X$ occurs (i.e. a sequence of pushes and pops which do not touch $ X$ occurs, ending with $ X$ being removed). The idea is that $ \langle qXr \rangle \Rightarrow^* w$ iff $ (q,w,X)\vdash^*_M (r,\epsilon,\epsilon)$ . That is, $ w$ can be derived in our grammar from non-terminal $ \langle qXr\rangle$ exactly when $ w$ is consumed starting from state $ q$ with $ X$ on the stack and ending up in state $ r$ with $ X$ removed.

Construction

We shall require the following conditions on the PDA at hand:

  • it should have a unique final state. Moreover, in this final state, we pop the empty-stack symbol $ Z_0$ ;
  • each transition performs a stack operation of type $ Y_1Y_2$ (e.g. a push) or $ \epsilon$ (a pop):
    • if the PDA performs a more complicated combination of push-pops, we can add intermediate transitions which obey the above rule;
    • if the PDA does not touch the stack, we push a dummy symbol and subsequently pop it;

It is easy to take any PDA and transform it into an equivalent one which obeys the two conditions above.
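The second condition can be enforced mechanically. The sketch below is one possible way to do it, under assumptions that are not in the text: every transition `(q, a, X, r, push)` pops exactly one symbol `X`, the dummy symbol `D` does not occur in $ \Gamma$ , and the fresh state names `t0`, `t1`, ... do not clash with existing states. The function name `normalize` is hypothetical, and the unique-final-state condition is not handled here.

```python
from itertools import count

def normalize(delta, dummy="D"):
    """Rewrite transitions so that every push has length 0 or 2."""
    fresh = (f"t{i}" for i in count())       # names for the intermediate states
    out = []
    for q, a, X, r, push in delta:
        push = tuple(push)
        if len(push) in (0, 2):              # already in the required form
            out.append((q, a, X, r, push))
        elif len(push) == 1:                 # push a dummy symbol, then pop it
            t = next(fresh)
            out.append((q, a, X, t, (dummy, push[0])))
            out.append((t, "", dummy, r, ()))
        else:                                # build Y1...Yk two symbols at a time
            state, read, top = q, a, X
            for i in range(len(push) - 2, 0, -1):
                t = next(fresh)
                out.append((state, read, top, t, (push[i], push[i + 1])))
                state, read, top = t, "", push[i]
            out.append((state, read, top, r, (push[0], push[1])))
    return out

delta = [("q", "a", "X", "r", ("X",)),             # leaves the stack unchanged
         ("q", "b", "X", "r", ("A", "B", "C"))]    # replaces X by three symbols
out = normalize(delta)
for t in out:
    print(t)
```

In the second case the intermediate state first leaves $ BC$ on the stack and then replaces $ B$ by $ AB$ , so the stack ends up holding $ ABC$ as in the original transition.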

We construct $ G=(V,\Sigma,R,S)$ as follows:

  • $ V=\{\langle qXr \rangle \mid q,r\in K,X\in\Gamma \}\cup\Sigma\cup\{S\}$ , where $ S$ is a fresh start symbol; some non-terminals from $ V$ may end up being unused;
  • we build production $ S\rightarrow\langle q_0Z_0p\rangle$ where $ p\in F$ . This non-terminal models the sequence of transitions going from the initial state to the final state, while the empty symbol is popped. This sequence of transitions marks the acceptance of a word.
  • If $ \Delta$ contains $ (q,a,X,r,Y_1Y_2)$ , then we build $ |K|^2$ productions of the form:
    • $ \langle qXr_2 \rangle \rightarrow a\langle rY_1r_1\rangle\langle r_1Y_2r_2\rangle$ for all $ r_1,r_2\in K$
    • in other words, in order to obtain a stack with everything unchanged 'below' $ X$ , starting from $ q$ and ending up in some $ r_2$ , we must eat symbol $ a$ , then from $ r$ we must pop $ Y_1$ , then $ Y_2$ . We do not know what state will be reached after popping $ Y_1$ , so we consider all possible states. The same holds for $ Y_2$ .
  • If $ \Delta$ contains $ (q,a,X,r,\epsilon)$ , then we build the production:
    • $ \langle qXr \rangle \rightarrow a$ where $ a$ could be the empty string;
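The rules above translate almost literally into code. The following minimal sketch assumes the PDA already satisfies the two conditions, represents a non-terminal $ \langle qXr\rangle$ as the Python tuple `(q, X, r)` and encodes $ \epsilon$ as `""`. The function name `pda_to_cfg` is hypothetical. The transitions used for illustration are those of the PDA from the example in this section.

```python
from itertools import product

def pda_to_cfg(states, delta, q0, Z0, final):
    """Return productions (lhs, rhs); rhs is a tuple of terminals and triples."""
    rules = [("S", ((q0, Z0, final),))]
    for q, a, X, r, push in delta:
        head = (a,) if a else ()                 # a == "" encodes ε
        if len(push) == 0:                       # pop transition
            rules.append(((q, X, r), head))
        elif len(push) == 2:                     # push transition: |K|^2 rules
            Y1, Y2 = push
            for r1, r2 in product(states, repeat=2):
                rules.append(((q, X, r2), head + ((r, Y1, r1), (r1, Y2, r2))))
        else:
            raise ValueError("transition is not in the required form")
    return rules

delta = [("q0", "0", "Z0", "q0", ("X", "Z0")),
         ("q0", "0", "X",  "q0", ("X", "X")),
         ("q0", "1", "X",  "q1", ()),
         ("q1", "1", "X",  "q1", ()),
         ("q1", "",  "Z0", "q1", ())]
rules = pda_to_cfg(["q0", "q1"], delta, "q0", "Z0", "q1")
print(len(rules))   # 12
```

Each push transition contributes $ |K|^2$ productions and each pop transition contributes one, so with two states the example yields $ 1+4+4+3=12$ productions.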

Example

Consider the following PDA, which accepts $ L=\{0^n1^n\mid n\geq 1\}$ .

Current state Input Stack top Next state Stack op
$ q_0$ $ 0$ $ Z_0$ $ q_0$ $ XZ_0$
$ q_0$ $ 0$ $ X$ $ q_0$ $ XX$
$ q_0$ $ 1$ $ X$ $ q_1$ $ \epsilon$
$ q_1$ $ 1$ $ X$ $ q_1$ $ \epsilon$
$ q_1$ $ \epsilon$ $ Z_0$ $ q_1$ $ \epsilon$

Note that the final transition was not necessary for accepting $ L$ , but was required by our construction. The final state is $ q_1$ .

We first build production:

$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$ which captures the event that we pop the empty stack-symbol starting from the initial state, and ending in the final state (word-acceptance).

The first transition generates the following template production:

$ \langle q_0 Z_0 r_2 \rangle \rightarrow 0 \langle q_0 X r_1\rangle\langle r_1Z_0 r_2\rangle$ with $ r_1,r_2\in \{q_0,q_1\}$ . We have thus defined four productions.

Similarly, the second transition defines:

$ \langle q_0 X r_2 \rangle \rightarrow 0 \langle q_0 X r_1\rangle\langle r_1X r_2\rangle$ which are another four productions.

The third and fourth transitions yield the productions:

$ \langle q_0 X q_1 \rangle \rightarrow 1$

$ \langle q_1 X q_1 \rangle \rightarrow 1$

And the final transition:

$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$

The complete set of productions is:

$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$

$ \langle q_0 Z_0 q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_0\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_0\rangle$

$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$

$ \langle q_0 X q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0X q_0\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_0\rangle$

$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_0\rangle\langle q_0X q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$

$ \langle q_1 X q_1 \rangle \rightarrow 1$

$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$

To make sense of this grammar, note that non-terminals $ \langle q_1 X q_0\rangle$ and $ \langle q_1 Z_0 q_0 \rangle$ do not appear in the LHS of a production, hence any derivation that includes rules containing them will get stuck.

We eliminate such rules. The result is:

$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$

$ \langle q_0 Z_0 q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_0\rangle$

$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$

$ \langle q_0 X q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0X q_0\rangle$

$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_0\rangle\langle q_0X q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$

$ \langle q_1 X q_1 \rangle \rightarrow 1$

$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$

Next, we observe that every rule for $ \langle q_0 Z_0 q_0 \rangle$ and $ \langle q_0 X q_0\rangle$ produces one of these two non-terminals again, ad infinitum. Derivations including them will never produce actual words. For this reason, we drop every rule in which they occur:

$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$

$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$

$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$

$ \langle q_1 X q_1 \rangle \rightarrow 1$

$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$
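Both simplification steps are instances of the standard removal of useless symbols: first drop every non-terminal that cannot derive a terminal word (this covers non-terminals with no production as well as those that only regenerate themselves), then drop whatever is unreachable from $ S$ . The sketch below applies this to the complete set of productions of the example; the helper names `N` and `prune` and the string encoding of non-terminals are assumptions made for illustration.

```python
def N(q, X, r):
    """Printable name for the non-terminal <q X r>."""
    return f"<q{q} {X} q{r}>"

# The twelve productions of the example, as (lhs, rhs) pairs.
rules = [("S", (N(0, "Z0", 1),))]
for X in ("Z0", "X"):                        # the two push transitions
    for r1 in (0, 1):
        for r2 in (0, 1):
            rules.append((N(0, X, r2), ("0", N(0, "X", r1), N(r1, X, r2))))
rules += [(N(0, "X", 1), ("1",)), (N(1, "X", 1), ("1",)), (N(1, "Z0", 1), ())]

def prune(rules, start="S", terminals=("0", "1")):
    # Step 1: non-terminals that can derive a word of terminals.
    generating, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in generating and all(
                    s in terminals or s in generating for s in rhs):
                generating.add(lhs)
                changed = True
    rules = [(l, r) for l, r in rules
             if l in generating
             and all(s in terminals or s in generating for s in r)]
    # Step 2: non-terminals reachable from the start symbol.
    reachable, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for lhs, rhs in rules:
            if lhs == current:
                for s in rhs:
                    if s not in terminals and s not in reachable:
                        reachable.add(s)
                        todo.append(s)
    return [(l, r) for l, r in rules if l in reachable]

kept = prune(rules)
for lhs, rhs in kept:
    print(lhs, "->", " ".join(rhs) or "ε")
```

On this input the sketch keeps six productions, which are the five lines above once the two alternatives for $ \langle q_0 X q_1\rangle$ are counted separately.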

Also, we can remove the final two productions, and replace the occurrences of their left-hand side non-terminals with $ 1$ (resp. $ \epsilon$ ), which yields:

$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$

$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_1\rangle$

$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_1\rangle 1$

Finally, we can merge the first two productions, and write $ A$ instead of $ \langle q_0 X q_1 \rangle$ , which produces an easy-to-read grammar:

$ S \rightarrow 0 A$

$ A \rightarrow 1 \mid 0 A 1$
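As a sanity check, the words of this grammar can be enumerated up to a length bound and compared with words of the shape $ 0^n1^n$ . The sketch below hard-codes the two rules; the function name `words` is hypothetical. Pruning by length is safe here because no production shortens a sentential form.

```python
def words(max_len):
    """Words of length <= max_len derivable from S in S -> 0A, A -> 1 | 0A1."""
    productions = {"S": ["0A"], "A": ["1", "0A1"]}
    found, forms = set(), {"S"}
    while forms:
        nxt = set()
        for form in forms:
            # position of the left-most non-terminal, if any
            i = next((k for k, c in enumerate(form) if c in productions), None)
            if i is None:
                found.add(form)              # only terminals are left
                continue
            for rhs in productions[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if len(new) <= max_len:
                    nxt.add(new)
        forms = nxt
    return sorted(found, key=len)

print(words(6))   # ['01', '0011', '000111']
```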