CFG - PDA equivalence
We prove that the languages generated by Context-Free Grammars coincide with those accepted by PDAs. The proof is done in two steps, one for each direction.
CFG to PDA
Given a Context-Free grammar $ G=(V,\Sigma,R,S)$ , we build a PDA $ M=(K,\Sigma,\Gamma,\Delta,q_0,Z_0,F)$ which accepts precisely $ L(G)$ . The main idea is:
- for each derivation step $ \alpha A \beta \Rightarrow_G \alpha \gamma \beta$ in $ G$ , we will execute one transition in $ M$ .
Consider the following example:
$ S\rightarrow AB$
$ A\rightarrow aB \mid bA$
$ B\rightarrow bB\mid \epsilon$
As well as the derivation:
$ S\Rightarrow AB \Rightarrow bAB \Rightarrow baBB \Rightarrow babBB \Rightarrow babbBB \Rightarrow babbB \Rightarrow babb$
which illustrates $ babb\in L(G)$ for our example. For this derivation, we should have the configuration sequence:
$ (?,babb,Z_0) \vdash (?,babb,SZ_0) \vdash (?,babb,ABZ_0) \vdash (?,babb,bABZ_0) \vdash$
$ (?,abb,ABZ_0) \vdash (?,abb,aBBZ_0)\vdash(?,bb,BBZ_0)\vdash(?,bb,bBBZ_0) \vdash$
$ (?,b,BBZ_0) \vdash (?,b,bBBZ_0) \vdash (?,\epsilon,BBZ_0)\vdash(?,\epsilon,BZ_0) \vdash$
$ (?,\epsilon,Z_0)$
Ideas:
- each possible derivation step in $ G$ (not necessarily part of a successful derivation) should correspond to a transition in $ M$
- we use the stack to hold the non-terminals which are due to be expanded. For this to work, we must consider left-most derivations only
- whenever the stack contains a terminal symbol coinciding with the input, we pop it;
- whenever the top of the stack is a non-terminal, we pop it, and (non-deterministically) push the right-hand side of one of its productions.
Construction
- $ K = \{q_0,p\}$
- $ \Gamma = V\cup\{Z_0\}$
- build transition $ (q_0,\epsilon,\epsilon,p,S)$ which puts the start symbol $ S$ on the stack;
- for each production $ A\rightarrow\gamma$ with $ \gamma\in V^*$ , we build transition $ (p,\epsilon,A,p,\gamma)$ , which replaces $ A$ with $ \gamma$ on the stack, without consuming the input
- for each symbol $ a\in\Sigma$ , build transition $ (p,a,a,p,\epsilon)$ , which pops a symbol off the stack, once it is read at input;
- $ F=\{p\}$
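The construction above can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that are not part of the text: every symbol is encoded as a single character, `""` stands for $ \epsilon$ , the bottom symbol $ Z_0$ is written `"Z"`, and a word is accepted when the machine is in state `p` with the input consumed and only `Z` left on the stack (as in the last configuration of the example sequence). The names `build_pda` and `accepts` are hypothetical. The breadth-first search prunes configurations holding more terminals on the stack than there is input left; it terminates for the example grammar, but is not guaranteed to do so for arbitrary grammars (e.g. left-recursive ones).

```python
from collections import deque

# Example grammar from the text. Assumed encoding: one character per symbol,
# upper-case letters are non-terminals and "" stands for epsilon.
GRAMMAR = {"S": ["AB"], "A": ["aB", "bA"], "B": ["bB", ""]}
TERMINALS = {"a", "b"}


def build_pda(grammar, terminals, start="S"):
    """Return the transitions (state, input, pop, next_state, push) of M."""
    delta = [("q0", "", "", "p", start)]  # put the start symbol on the stack
    for head, bodies in grammar.items():
        for body in bodies:
            delta.append(("p", "", head, "p", body))  # replace A by gamma
    for a in sorted(terminals):
        delta.append(("p", a, a, "p", ""))  # match a terminal with the input
    return delta


def accepts(delta, terminals, word, bottom="Z"):
    """Breadth-first search over configurations (state, input left, stack)."""
    start = ("q0", word, bottom)
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "p" and rest == "" and stack == bottom:
            return True
        for q, a, pop, r, push in delta:
            if q != state or not rest.startswith(a) or not stack.startswith(pop):
                continue
            nxt = (r, rest[len(a):], push + stack[len(pop):])
            # Prune: every terminal on the stack must still be matched
            # against the input, so there cannot be more than len(input).
            if sum(c in terminals for c in nxt[2]) > len(nxt[1]):
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


delta = build_pda(GRAMMAR, TERMINALS)
print(accepts(delta, TERMINALS, "babb"))  # the word derived in the example
```

For the example grammar, `build_pda` produces one initial transition, five expansion transitions (one per production) and two matching transitions (one per terminal).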
Proof
To prove that $ L(M)=L(G)$ , where $ M$ is built from $ G$ following the above rules, we must observe that $ M$ does not simulate all possible derivations, but only those which are left-most (as shown by our example). The reason is that $ M$ eats symbols as it encounters them, and can only do this from left to right.
Thus, we need to establish:
Proposition:
Given a CFG $ G$ and a word $ w$ such that $ S\Rightarrow^*_G w$ , then there exists a sequence of derivations where the first non-terminal to the left is always expanded first, which derives the word $ w$ .
In other words, if we can derive $ w$ in $ G$ , then we can also derive it via left-most derivations only. We omit the proof of this proposition.
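The proposition can be illustrated (not proved) on the example grammar with a small Python sketch. The helper `derive` is hypothetical: it applies a list of chosen productions either to the left-most or to the right-most non-terminal of the current sentential form. Symbols are assumed to be single characters, with `""` standing for $ \epsilon$ . The two runs use the same productions in a different order and reach the same word $ babb$ .

```python
NONTERMINALS = {"S", "A", "B"}


def derive(start, choices, nonterminals, leftmost=True):
    """Apply each (head, body) production to the left-most (or right-most)
    non-terminal and return the list of sentential forms."""
    form, forms = start, [start]
    for head, body in choices:
        positions = [i for i, c in enumerate(form) if c in nonterminals]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != head:
            raise ValueError(f"expected {head} at position {pos} of {form}")
        form = form[:pos] + body + form[pos + 1:]
        forms.append(form)
    return forms


# Left-most derivation: the one given in the example.
LEFT_CHOICES = [("S", "AB"), ("A", "bA"), ("A", "aB"),
                ("B", "bB"), ("B", "bB"), ("B", ""), ("B", "")]
# The same productions, reordered so that the right-most non-terminal is
# always expanded first.
RIGHT_CHOICES = [("S", "AB"), ("B", ""), ("A", "bA"), ("A", "aB"),
                 ("B", "bB"), ("B", "bB"), ("B", "")]

left = derive("S", LEFT_CHOICES, NONTERMINALS, leftmost=True)
right = derive("S", RIGHT_CHOICES, NONTERMINALS, leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```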
We first show that $ L(G) \subseteq L(M)$ .
Proposition:
If $ S\Rightarrow_G^* \alpha\beta$ , using left-most derivations only, and with $ \alpha\in\Sigma^*$ and $ \beta\in(V\setminus\Sigma)V^*\cup\{\epsilon\}$ then $ (p,\alpha,S)\vdash^*_M(p,\epsilon,\beta)$ .
Our inclusion follows for $ \beta=\epsilon$ .
Proof:
The proof is by induction over the length of the derivation.
Basis: zero-length derivation.
$ S\Rightarrow^*S$ in zero steps. Then $ (p,\epsilon,S)\vdash^*_M(p,\epsilon,S)$ , by reflexivity of $ \vdash_M^*$ .
Induction step: Suppose $ S\Rightarrow^* \alpha\beta$ in $ n+1$ steps. Then $ S\Rightarrow^* uv \Rightarrow\alpha\beta$ . Also, $ \alpha$ and $ u$ contain terminal symbols only, while $ \beta$ and $ v$ start with a non-terminal.
Let us look at the last derivation step $ uv\Rightarrow\alpha\beta$ . Since $ v$ must start with a non-terminal, $ v$ is a word of the form $ Av'$ . Then $ A$ is the first non-terminal, hence a production $ A\rightarrow\gamma$ must exist in $ G$ . Moreover, we can safely assume that $ \gamma=xBy$ where $ x\in\Sigma^*$ and $ B$ is a non-terminal (the reasoning is similar if no non-terminal in $ \gamma$ exists). Therefore, our derivation actually has the following structure:
$ uAv' \Rightarrow uxByv'$ where $ \alpha=ux$ and $ \beta=Byv'$ .
Since $ S\Rightarrow^* uAv'$ in $ n$ steps, by the induction hypothesis, $ (p,u,S)\vdash^*_M(p,\epsilon,Av')$ .
Let us start from configuration $ (p,\alpha,S) = (p,ux,S)$ . The induction hypothesis entails that we can eat the $ u$ portion of the word: $ (p,ux,S)\vdash^*_M(p,x,Av')$ . By construction of $ M$ , we can replace $ A$ by $ xBy$ on the stack without consuming the input: $ (p,x,Av')\vdash_M(p,x,xByv')$ . We have just simulated the production $ A\rightarrow xBy$ . By construction of $ M$ , we can also eat each symbol from $ x$ , while removing it from the stack: $ (p,x,xByv')\vdash^*_M(p,\epsilon,Byv')$ . The word $ Byv'$ is actually $ \beta$ (i.e. a word starting with a non-terminal).
The proof is finished.
Next, we show $ L(M) \subseteq L(G)$ via the following proposition:
Proposition:
If $ (p,\alpha,S)\vdash^*_M(p,\epsilon,\beta)$ , where $ \alpha\in\Sigma^*$ and $ \beta\in V^*$ , then $ S\Rightarrow_G^*\alpha\beta$ .
Notice that this implication is not precisely the converse of the previous one: $ \beta$ need not start with a non-terminal. The proof is similar to the above. We leave it as an exercise.
PDA to CFG
To construct a grammar from a PDA, we need to envision the sequence of transitions of a PDA, as a:
- sequence of pop-events, while parts of the input are being consumed.
- a pop-event of symbol $ A$ is a sequence of pushes and pops (which do not affect $ A$ , or the symbols under it), which ultimately ends with the popping of $ A$ .
With this in mind, we shall construct non-terminals in a grammar as triples:
- $ \langle qXr \rangle$ where $ q,r$ are states of the PDA and $ X$ is a symbol.
- such a non-terminal models a sequence of transitions where the PDA goes from state $ q$ to state $ r$ , while the pop-event $ X$ occurs (i.e. a sequence of push-pops which do not touch $ X$ occur, and which end up with $ X$ being removed). The idea is that $ \langle qXr \rangle \Rightarrow^* w$ iff $ (q,w,X)\vdash^*_M (r,\epsilon,\epsilon)$ . That is, $ w$ is consumed starting from state $ q$ with $ X$ on the stack and ending up in state $ r$ with $ X$ removed, exactly when $ w$ can be derived in our grammar from non-terminal $ \langle qXr\rangle$ .
Construction
We shall require the following conditions on the PDA at hand:
- it should have a unique final state. Moreover, in this final state, we pop the empty-stack symbol $ Z_0$ ;
- each transition performs a stack operation of type $ Y_1Y_2$ (e.g. a push) or $ \epsilon$ (a pop):
- if the PDA performs a more complicated combination of push-pops, we can add intermediate transitions which obey the above rule;
- if the PDA does not touch the stack, we push a dummy symbol and subsequently pop it;
It is easy to take any PDA and transform it into an equivalent one which obeys the two conditions above.
We construct $ G=(V,\Sigma,R,S)$ as follows:
- $ V=\{\langle qXr \rangle \mid q,r\in K,X\in\Gamma \}\cup\Sigma\cup\{S\}$ ; some non-terminals from $ V$ may end up being unused;
- we build production $ S\rightarrow\langle q_0Z_0p\rangle$ where $ p$ is the unique final state. This non-terminal models the sequence of transitions going from the initial state to the final state, while the empty-stack symbol $ Z_0$ is popped. This sequence of transitions marks the acceptance of a word.
- If $ \Delta$ contains $ (q,a,X,r,Y_1Y_2)$ , then we build $ |K|^2$ productions of the form:
- $ \langle qXr_2 \rangle \rightarrow a\langle rY_1r_1\rangle\langle r_1Y_2r_2\rangle$ for all $ r_1,r_2\in K$
- in other words, in order to obtain a stack with everything unchanged 'below' $ X$ , starting from $ q$ and ending up in some $ r_2$ , we must eat symbol $ a$ , then from $ r$ we must pop $ Y_1$ , then $ Y_2$ . We do not know what state will be reached after popping $ Y_1$ , so we consider all possible states. The same holds for $ Y_2$ .
- If $ \Delta$ contains $ (q,a,X,r,\epsilon)$ , then we build the production:
- $ \langle qXr \rangle \rightarrow a$ where $ a$ could be the empty string;
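The two production schemes can be sketched in Python as follows. This is a minimal sketch under assumptions that are not part of the text: a non-terminal $ \langle qXr \rangle$ is encoded as the tuple `(q, X, r)`, terminals are strings, `""` stands for $ \epsilon$ , the pushed word is a tuple of stack symbols, and the function name `pda_to_cfg` is hypothetical. It is run on the PDA for $ 0^n1^n$ used in the example that follows.

```python
from itertools import product


def pda_to_cfg(states, delta, q0, z0, final):
    """Return productions as (head, body) pairs; body is a tuple of symbols."""
    rules = [("S", ((q0, z0, final),))]  # S -> <q0 Z0 p>
    for q, a, x, r, push in delta:
        sym = (a,) if a else ()  # a == "" encodes an epsilon move
        if len(push) == 2:
            # (q, a, X, r, Y1 Y2): one production per pair (r1, r2)
            y1, y2 = push
            for r1, r2 in product(states, repeat=2):
                rules.append(((q, x, r2), sym + ((r, y1, r1), (r1, y2, r2))))
        elif len(push) == 0:
            # (q, a, X, r, epsilon): <q X r> -> a
            rules.append(((q, x, r), sym))
        else:
            raise ValueError("transition is not in the required form")
    return rules


STATES = ["q0", "q1"]
DELTA = [
    ("q0", "0", "Z0", "q0", ("X", "Z0")),
    ("q0", "0", "X", "q0", ("X", "X")),
    ("q0", "1", "X", "q1", ()),
    ("q1", "1", "X", "q1", ()),
    ("q1", "", "Z0", "q1", ()),
]

rules = pda_to_cfg(STATES, DELTA, "q0", "Z0", "q1")
for head, body in rules:
    print(head, "->", body)
```

Each alternative is listed as a separate production, so the two push transitions contribute $ 2^2=4$ productions each.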
Example
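Consider the following PDA, which accepts $ L=\{0^n1^n\mid n\geq 1\}$ (the empty word is not accepted, since no transition is available from $ q_0$ on $ \epsilon$ with $ Z_0$ on the stack).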
Current state | Input | Stack top | Next state | Stack op |
---|---|---|---|---|
$ q_0$ | $ 0$ | $ Z_0$ | $ q_0$ | $ XZ_0$ |
$ q_0$ | $ 0$ | $ X$ | $ q_0$ | $ XX$ |
$ q_0$ | $ 1$ | $ X$ | $ q_1$ | $ \epsilon$ |
$ q_1$ | $ 1$ | $ X$ | $ q_1$ | $ \epsilon$ |
$ q_1$ | $ \epsilon$ | $ Z_0$ | $ q_1$ | $ \epsilon$ |
Note that the final transition was not necessary for accepting $ L$ , but was required by our construction. The final state is $ q_1$ .
We first build production:
$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$ which captures the event that we pop the empty stack-symbol starting from the initial state, and ending in the final state (word-acceptance).
The first transition generates the following template production:
$ \langle q_0 Z_0 r_2 \rangle \rightarrow 0 \langle q_0 X r_1\rangle\langle r_1Z_0 r_2\rangle$ with $ r_1,r_2\in \{q_0,q_1\}$ . We have thus defined four productions.
Similarly, the second transition defines:
$ \langle q_0 X r_2 \rangle \rightarrow 0 \langle q_0 X r_1\rangle\langle r_1X r_2\rangle$ with $ r_1,r_2\in \{q_0,q_1\}$ , which are another four productions.
The third and fourth transitions yield the productions:
$ \langle q_0 X q_1 \rangle \rightarrow 1$
$ \langle q_1 X q_1 \rangle \rightarrow 1$
And the final transition:
$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$
The complete set of productions is:
$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$
$ \langle q_0 Z_0 q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_0\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_0\rangle$
$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$
$ \langle q_0 X q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0X q_0\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_0\rangle$
$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_0\rangle\langle q_0X q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$
$ \langle q_1 X q_1 \rangle \rightarrow 1$
$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$
To make sense of this grammar, note that non-terminals $ \langle q_1 X q_0\rangle$ and $ \langle q_1 Z_0 q_0 \rangle$ do not appear in the LHS of a production, hence any derivation that includes rules containing them will get stuck.
We eliminate such rules. The result is:
$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$
$ \langle q_0 Z_0 q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_0\rangle$
$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0Z_0 q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$
$ \langle q_0 X q_0 \rangle \rightarrow 0 \langle q_0 X q_0\rangle\langle q_0X q_0\rangle$
$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_0\rangle\langle q_0X q_1\rangle \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$
$ \langle q_1 X q_1 \rangle \rightarrow 1$
$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$
Next, we observe that the rules having $ \langle q_0 Z_0 q_0 \rangle$ and $ \langle q_0 X q_0\rangle$ continue to generate non-terminals ad infinitum: every remaining production for these two non-terminals contains one of them again. Derivations including them will never produce actual words. For this reason, we discard such rules:
$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$
$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_1\rangle\langle q_1Z_0 q_1\rangle$
$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_1\rangle\langle q_1X q_1\rangle$
$ \langle q_1 X q_1 \rangle \rightarrow 1$
$ \langle q_1 Z_0 q_1 \rangle \rightarrow \epsilon$
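Both pruning steps amount to keeping only the non-terminals that can derive some terminal word, which can be computed as a fixpoint. The Python sketch below is one possible way to do this, not necessarily the procedure the text has in mind. It assumes that the non-terminal $ \langle qXr \rangle$ is written as the string `"qXr"` and that an empty body encodes $ \epsilon$ ; the names `generating_symbols` and `prune` are hypothetical. It starts from the complete set of twelve productions.

```python
# Complete set of productions; the non-terminal <qXr> is written "qXr".
RULES = [
    ("S", ["q0Z0q1"]),
    ("q0Z0q0", ["0", "q0Xq0", "q0Z0q0"]),
    ("q0Z0q0", ["0", "q0Xq1", "q1Z0q0"]),
    ("q0Z0q1", ["0", "q0Xq0", "q0Z0q1"]),
    ("q0Z0q1", ["0", "q0Xq1", "q1Z0q1"]),
    ("q0Xq0", ["0", "q0Xq0", "q0Xq0"]),
    ("q0Xq0", ["0", "q0Xq1", "q1Xq0"]),
    ("q0Xq1", ["1"]),
    ("q0Xq1", ["0", "q0Xq0", "q0Xq1"]),
    ("q0Xq1", ["0", "q0Xq1", "q1Xq1"]),
    ("q1Xq1", ["1"]),
    ("q1Z0q1", []),  # epsilon production
]
TERMINALS = {"0", "1"}


def generating_symbols(rules, terminals):
    """Non-terminals that can derive at least one terminal word."""
    gen = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in gen and all(s in gen for s in body):
                gen.add(head)
                changed = True
    return gen - set(terminals)


def prune(rules, terminals):
    """Keep only productions whose symbols are all generating."""
    ok = generating_symbols(rules, terminals) | set(terminals)
    return [(h, b) for h, b in rules if h in ok and all(s in ok for s in b)]


pruned = prune(RULES, TERMINALS)
for head, body in pruned:
    print(head, "->", " ".join(body) if body else "epsilon")
```

On this input, the surviving productions are those for $ S$ , $ \langle q_0 Z_0 q_1 \rangle$ , $ \langle q_0 X q_1 \rangle$ , $ \langle q_1 X q_1 \rangle$ and $ \langle q_1 Z_0 q_1 \rangle$ .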
Also, we can remove the productions for $ \langle q_1 X q_1 \rangle$ and $ \langle q_1 Z_0 q_1 \rangle$ , and replace the occurrences of these non-terminals with $ 1$ (resp. $ \epsilon$ ), which yields:
$ S \rightarrow \langle q_0 Z_0 q_1 \rangle$
$ \langle q_0 Z_0 q_1 \rangle \rightarrow 0 \langle q_0 X q_1\rangle$
$ \langle q_0 X q_1 \rangle \rightarrow 1 \mid 0 \langle q_0 X q_1\rangle 1$
Finally, we can merge the two productions, and write $ A$ instead of $ \langle q_0 X q_1 \rangle$ which produces an easy-to-read grammar:
$ S \rightarrow 0 A$
$ A \rightarrow 1 \mid 0 A 1$
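As a sanity check, the words generated by this final grammar can be enumerated up to a given length and compared with $ 0^n1^n$ . The sketch below is a hypothetical helper, assuming single-character symbols; it terminates for this grammar because every production adds at least one terminal, which is not true of arbitrary grammars.

```python
RULES = {"S": ["0A"], "A": ["1", "0A1"]}


def words_up_to(rules, start, max_len):
    """Terminal words of length <= max_len derivable from start,
    found by expanding the left-most non-terminal."""
    words, todo, seen = set(), [start], {start}
    while todo:
        form = todo.pop()
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:  # no non-terminal left: form is a word
            words.add(form)
            continue
        for body in rules[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            # Terminals never disappear, so over-long forms can be dropped.
            if sum(c not in rules for c in new) <= max_len and new not in seen:
                seen.add(new)
                todo.append(new)
    return words


words = words_up_to(RULES, "S", 6)
print(sorted(words, key=len))
```

Up to length 6, the enumeration yields exactly the words $ 0^n1^n$ with $ 1\leq n\leq 3$ ; the empty word is not generated.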