PCP is undecidable

In this section, we will prove that Post's Correspondence Problem (PCP) is undecidable. Compared with the other proofs in this chapter, this one is more involved. The proof is a two-step reduction from the halting problem: we first reduce the halting problem to a restricted variant of PCP (FTPCP, defined below), and then reduce that variant to PCP.

The core idea of the proof is to construct a set of pairs $ \alpha_i, \beta_i$ , such that adding a pair to a partial matching corresponds to a transition of the Turing Machine.

Informally, a word $ \beta_i$ will describe a Turing Machine configuration, i.e. the current state, the head position, and the current word on the tape.

The bottom partial matching will always be longer than the upper one.

A complete matching will represent a sequence of configurations of the Turing Machine, starting from the initial state, and ending in a final (accepting) state.

In order to make the proof work, we need a few technical fixes:

  • we need to restrict PCP in order to force the selection of the first tile, for any possible matching
  • we need to add some (minor) restrictions to the Turing Machines which are the input of the halting problem.

We shall describe these restrictions below:

  • for this proof, we consider that Turing Machines have only two final states $ F=\{s_{yes},s_{no}\}$ , which model the output of $ 0/1$ . It is straightforward to turn an arbitrary Turing Machine into such a yes/no Turing Machine.
  • We require that the execution of $ M$ on $ w$ never moves the head to the left of the first symbol of the word. Ensuring this is technically more involved, but algorithmically possible. We shall skip these details.
  • We require that $ w \neq \epsilon $, hence we do not allow simulations on the empty word. To accommodate this, we can add a new symbol to the alphabet, which is used specifically for encoding the empty word.

Definition (FTPCP):

Let $ \alpha_1, \ldots, \alpha_n$ and $\beta_1, \ldots, \beta_n$ be sequences of words over a fixed alphabet. FTPCP asks whether there exists a finite sequence of indices $ a_1a_2 \ldots a_k$ , with $ a_1 = 1$ and $ a_i \in \{1, \ldots, n\}$ , such that:

$ \alpha_{a_1}\alpha_{a_2}\ldots \alpha_{a_k} = \beta_{a_1}\beta_{a_2}\ldots \beta_{a_k}$
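To make the definition concrete, here is a minimal Python sketch that checks whether a given index sequence is an FTPCP matching. The function name `is_ftpcp_match`, the representation of tiles as pairs of strings, and the sample instance are assumptions made for illustration; they are not part of the construction.

```python
from typing import Sequence, Tuple

Tile = Tuple[str, str]  # (alpha_i, beta_i)

def is_ftpcp_match(tiles: Sequence[Tile], indices: Sequence[int]) -> bool:
    """Return True if the 1-based `indices` start with tile 1 and
    the concatenated top words equal the concatenated bottom words."""
    if not indices or indices[0] != 1:
        return False  # FTPCP forces the first tile
    if any(not 1 <= a <= len(tiles) for a in indices):
        return False  # every index must name an existing tile
    top = "".join(tiles[a - 1][0] for a in indices)
    bottom = "".join(tiles[a - 1][1] for a in indices)
    return top == bottom

# Hypothetical instance, chosen only to exercise the checker.
tiles = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("abc", "c")]
print(is_ftpcp_match(tiles, [1, 2, 3, 1, 4]))  # both sides spell "abcaaabc"
print(is_ftpcp_match(tiles, [2, 3, 1, 4]))     # rejected: does not start with tile 1
```

Note that the checker only verifies a proposed sequence; deciding whether any such sequence exists is exactly the problem shown undecidable in this section.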

We show $ f_h\leq_T$ FTPCP.

Let $ M$ and $ w=c_1\ldots c_n$ be an input of the halting problem. We construct a matrix of $ \alpha_i / \beta_i$ pairs such that a match exists for FTPCP iff $ M(w)$ halts.

Step 1 (the first pair):

$ \left(\begin{array}{l} \alpha_1 \\ \beta_1 \end{array}\right) = \left(\begin{array}{l} \# \\ \# s_0c_1\ldots c_n \# \end{array}\right)$

  • The first tile must correspond to the initial configuration of $ M$ . Note that the alphabet of PCP is $ \Sigma\cup K$ - it also contains the states of $ M$ as symbols.
  • Also, the position of the state symbol is used to represent the head position. In the initial configuration, this points to the first symbol of the input word, namely $ c_1$ .

Example

Suppose we have $ \Sigma = \{\#,0,1\}$ and the input word $ w=100$ . Then, any matching of FTPCP must start as follows:

$ \left(\begin{array}{l} \# \\ \# s_0100 \# \end{array}\right)$
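As a small sketch of Step 1, the following Python function builds the first tile from an initial state and a non-empty input word. Words are represented as tuples of symbols, so that a state name such as `s0` counts as a single symbol; this representation and the name `first_tile` are assumptions made for illustration.

```python
from typing import Tuple

Word = Tuple[str, ...]    # a word is a tuple of symbols
Tile = Tuple[Word, Word]  # (top, bottom)

def first_tile(initial_state: str, word: str) -> Tile:
    """Build the Step 1 tile: '#' on top, '# s0 c1 ... cn #' on the bottom."""
    if not word:
        raise ValueError("the construction assumes a non-empty input word")
    top: Word = ("#",)
    # The state symbol sits immediately before the symbol under the head.
    bottom: Word = ("#", initial_state, *word, "#")
    return top, bottom

top, bottom = first_tile("s0", "100")
print(top)
print(bottom)
```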

Step 2 (moving right transitions):

For every pair of symbols $ c,c'\in \Sigma$ and states $ s,s' \in K$ where $ s\neq s_{no}$ such that $ \delta(s,c)=(s',c',R)$ , construct the tile:

$ \left(\begin{array}{l} sc \\ c's' \end{array}\right)$

  • when a matching with $ sc$ is performed, this indicates that the current state is $ s$ , and the head position indicates $ c$ .
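The tiles of Step 2 can be generated mechanically from the transition function. The sketch below assumes that $ \delta$ is given as a Python dictionary mapping `(state, symbol)` to `(new_state, written_symbol, move)`; the dictionary layout and the name `right_move_tiles` are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Word = Tuple[str, ...]
Tile = Tuple[Word, Word]  # (top, bottom)
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]  # (s, c) -> (s', c', move)

def right_move_tiles(delta: Delta, rejecting: str = "s_no") -> List[Tile]:
    """One tile (s c / c' s') per transition delta(s, c) = (s', c', R)."""
    tiles = []
    for (s, c), (s_next, c_next, move) in delta.items():
        if move == "R" and s != rejecting:
            tiles.append(((s, c), (c_next, s_next)))
    return tiles

# Hypothetical transition table with one right move and one left move.
delta = {("s0", "1"): ("s1", "0", "R"), ("s1", "0"): ("s2", "1", "L")}
right_tiles = right_move_tiles(delta)
print(right_tiles)
```

Only the right-moving transition produces a tile here; left-moving transitions need a different tile shape.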

Example

Suppose in our machine we have the transition $ \delta(s_0,1)=(s_1,0,R)$ . Then we can continue our matching as follows:

$ \left(\begin{array}{l} \#s_01 \\ \# s_0100 \# 0s_1 \end{array}\right)$

Note that on the bottom part, we have partially constructed the next configuration of the Turing Machine. The current state is now $ s_1$ and the head position indicates $ 0$ . We need the following ingredients to obtain the complete configuration description in the bottom part:

Step 3 (completing configurations)

For every $ c\in \Sigma$ , construct the tile: $ \left(\begin{array}{l} c \\ c \end{array}\right)$

Example

We can now continue our matching by selecting tiles from Step 3. We obtain:

$ \left(\begin{array}{l} \#s_0100\# \\ \# s_0100 \# 0s_100\# \end{array}\right)$
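The partial matching built so far can be replayed with a few lines of Python. This is only a sketch: the helper `extend` and the tuple-of-symbols representation are assumptions made for illustration. It also shows the invariant mentioned earlier: the top word is a prefix of the bottom word, and the part of the bottom that is not yet matched is the next configuration.

```python
def extend(matching, tile):
    """Append one tile to a partial matching given as (top, bottom)."""
    top, bottom = matching
    return top + tile[0], bottom + tile[1]

first = (("#",), ("#", "s0", "1", "0", "0", "#"))        # Step 1
move_right = (("s0", "1"), ("0", "s1"))                  # Step 2 tile
copy = {c: ((c,), (c,)) for c in ("#", "0", "1")}        # Step 3 tiles

matching = first
for tile in (move_right, copy["0"], copy["0"], copy["#"]):
    matching = extend(matching, tile)

top, bottom = matching
surplus = bottom[len(top):]  # part of the bottom not yet matched by the top
print(" ".join(top))
print(" ".join(bottom))
print(" ".join(surplus))
```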

Step 4 (moving left transitions)

For every $ c,c',c''\in \Sigma$ and every $ s,s'\in K$ where $ s\neq s_{no}$ such that: $ \delta(s,c) = (s',c',L)$ , add tiles:

$ \left(\begin{array}{l} c''sc \\ s'c''c' \end{array}\right)$

  • Unlike the previous step, where we add one tile per transition, here we add $ |\Sigma|$ tiles per transition. The intuition is as follows: for each left transition, and each possible symbol $ c''$ to the left of the head, we add the corresponding tile.
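A sketch of how the Step 4 tiles could be generated, again assuming $ \delta$ is a dictionary from `(state, symbol)` to `(new_state, written_symbol, move)`. The name `left_move_tiles` and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Word = Tuple[str, ...]
Tile = Tuple[Word, Word]  # (top, bottom)
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]  # (s, c) -> (s', c', move)

def left_move_tiles(delta: Delta, alphabet: Iterable[str],
                    rejecting: str = "s_no") -> List[Tile]:
    """For each transition delta(s, c) = (s', c', L) and each symbol c'',
    build the tile (c'' s c / s' c'' c')."""
    symbols = list(alphabet)
    tiles = []
    for (s, c), (s_next, c_next, move) in delta.items():
        if move != "L" or s == rejecting:
            continue
        for c_prev in symbols:  # c_prev plays the role of c''
            tiles.append(((c_prev, s, c), (s_next, c_prev, c_next)))
    return tiles

delta = {("s1", "0"): ("s2", "1", "L")}
left_tiles = left_move_tiles(delta, ("#", "0", "1"))
print(len(left_tiles))
print(left_tiles[1])
```

One left transition over a three-symbol alphabet yields three tiles, one for each symbol that may sit to the left of the head.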

Example

Continuing the previous example, suppose we have the transition $ \delta(s_1,0) = (s_2,1,L)$ . We extend the matching by selecting the appropriate tile with $ c''=0$ :

$ \left(\begin{array}{l} \#s_0100\#0s_10 \\ \# s_0100 \# 0s_100\#s_201 \end{array}\right)$

and by completing the configuration, we get:

$ \left(\begin{array}{l} \#s_0100\#0s_100\# \\ \# s_0100 \# 0s_100\#s_2010\# \end{array}\right)$

Step 5: (hold transitions)

Can you figure out this step by yourself?

Step 6: (completing a match)

For every $ c\in\Sigma\setminus\{\#\}$ , we add tiles:

$ \left(\begin{array}{l} cs_{yes} \\ s_{yes} \end{array}\right)$ , $ \left(\begin{array}{l} s_{yes}c \\ s_{yes} \end{array}\right)$

  • these pairs of tiles consume the rest of the input (from both the left and right-side of the head), when the yes-state is reached.
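The Step 6 tiles are equally easy to enumerate. The sketch below is illustrative only; the function name `accepting_tiles` and its parameters are assumptions.

```python
from typing import Iterable, List, Tuple

Word = Tuple[str, ...]
Tile = Tuple[Word, Word]  # (top, bottom)

def accepting_tiles(alphabet: Iterable[str], accepting: str = "s_yes",
                    separator: str = "#") -> List[Tile]:
    """For every symbol c other than '#', build (c s_yes / s_yes)
    and (s_yes c / s_yes)."""
    tiles = []
    for c in alphabet:
        if c == separator:
            continue  # the separator is never consumed by these tiles
        tiles.append(((c, accepting), (accepting,)))  # eat symbol left of head
        tiles.append(((accepting, c), (accepting,)))  # eat symbol right of head
    return tiles

yes_tiles = accepting_tiles(("#", "0", "1"))
print(len(yes_tiles))
```

Each of these tiles has a top word one symbol longer than its bottom word, which is what lets the top catch up with the bottom.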

Example

Suppose that, in our previous example $ s_2 = s_{yes}$ . Then we can attempt to complete the matching as follows:

$ \left(\begin{array}{l} \#s_0100\#0s_100\# \\ \# s_0100 \# 0s_100\#s_{yes}010\# \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}0 \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes} \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}010\# \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes}10\# \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}010\# s_{yes}1 \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes}10\# s_{yes} \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}010\# s_{yes}10\# \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes}10\# s_{yes}0\# \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}010\# s_{yes}10\# s_{yes}0 \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes}10\# s_{yes}0\# s_{yes} \end{array}\right)$

$ \left(\begin{array}{l} \#s_0100\#0s_100\# s_{yes}010\# s_{yes}10\# s_{yes}0\# \\ \# s_0100 \# 0s_100\#s_{yes}010\# s_{yes}10\# s_{yes}0\# s_{yes}\# \end{array}\right)$
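The example above can be replayed by tracking only the surplus, that is, the part of the bottom word that the top word has not matched yet. The following Python sketch does so; the helpers `apply`, `eat_right` and `copy` are hypothetical names introduced for illustration.

```python
YES = "s_yes"

def apply(surplus, tile):
    """Append one tile and return the new surplus of bottom over top.
    Raises ValueError if the tile's top word does not fit."""
    top, bottom = tile
    extended = surplus + bottom
    if extended[:len(top)] != top:
        raise ValueError("tile does not fit the current partial matching")
    return extended[len(top):]

def eat_right(c):
    """Step 6 tile (s_yes c / s_yes)."""
    return ((YES, c), (YES,))

def copy(c):
    """Step 3 tile (c / c)."""
    return ((c,), (c,))

surplus = (YES, "0", "1", "0", "#")  # unmatched bottom part at the start
history = [surplus]
for tile in (eat_right("0"), copy("1"), copy("0"), copy("#"),
             eat_right("1"), copy("0"), copy("#"),
             eat_right("0"), copy("#")):
    surplus = apply(surplus, tile)
    history.append(surplus)

for step in history:
    print(" ".join(step))
```

Each pass over a configuration removes one symbol, and the surplus left at the end is `s_yes #`, which the tiles introduced so far cannot remove.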

Step 7: (final touches)

What more is needed?

We prove PCP is undecidable by showing FTPCP $ \leq_T$ PCP. Let $ \left\{\left(\begin{array}{l} \alpha_i \\ \beta_i \end{array}\right)\right\}_{ 1 \leq i \leq n}$ be an instance of FTPCP. To build an instance of PCP, we need the following ingredients.

Consider an arbitrary word $ w=c_1\ldots c_k$ over alphabet $ \Sigma$ , and the symbol $ *$ which does not occur in $ \Sigma$ . We define the words:

  • $ \lhd w = *c_1*c_2* \ldots *c_k$
  • $ w \rhd = c_1*c_2* \ldots *c_k*$
  • $ \lhd w \rhd = *c_1*c_2* \ldots *c_k*$

obtained by inserting $ *$ before every symbol of $ w$ , after every symbol of $ w$ , and both before every symbol and after the last one, respectively.
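The three word constructions can be sketched in Python as follows, assuming words are plain strings of single-character symbols. The function names `lhd`, `rhd` and `lhd_rhd` simply mirror the notation and are not taken from any library.

```python
STAR = "*"  # assumed not to occur in the alphabet of w

def lhd(w: str) -> str:
    """Insert '*' before every symbol: abc -> *a*b*c"""
    return "".join(STAR + c for c in w)

def rhd(w: str) -> str:
    """Insert '*' after every symbol: abc -> a*b*c*"""
    return "".join(c + STAR for c in w)

def lhd_rhd(w: str) -> str:
    """Insert '*' before every symbol and after the last one: abc -> *a*b*c*"""
    return STAR + rhd(w)

print(lhd("abc"), rhd("abc"), lhd_rhd("abc"))
```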

Using our word constructions, we can construct the following instance of PCP: $ \left\{\left(\begin{array}{l} \lhd \alpha_1 \\ \lhd \beta_1 \rhd \end{array}\right)\right\} \cup \left\{\left(\begin{array}{l} \lhd \alpha_i \\ \beta_i \rhd \end{array}\right)\right\}_{ 1 \leq i \leq n} \cup \left\{\left(\begin{array}{l} \lhd @ \\ @ \end{array}\right)\right\}$

where $ *$ and $ @$ are symbols outside of the alphabet of the FTPCP instance.
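A minimal sketch of this transformation in Python, assuming tiles are pairs of strings over single-character symbols and that the pair starred on both sides is built from tile 1 only, as the correctness argument uses it. The function name `to_pcp` and the sample instance are hypothetical.

```python
from typing import List, Sequence, Tuple

Tile = Tuple[str, str]  # (top, bottom)

def to_pcp(tiles: Sequence[Tile], star: str = "*", end: str = "@") -> List[Tile]:
    """Turn an FTPCP instance (tile 1 first) into a PCP instance."""
    def lhd(w: str) -> str:   # '*' before every symbol
        return "".join(star + c for c in w)

    def rhd(w: str) -> str:   # '*' after every symbol
        return "".join(c + star for c in w)

    alpha1, beta1 = tiles[0]
    start = (lhd(alpha1), star + rhd(beta1))          # only pair that can begin a matching
    ordinary = [(lhd(a), rhd(b)) for a, b in tiles]   # one pair per original tile
    finish = (star + end, end)                        # supplies the missing '*' at the end
    return [start] + ordinary + [finish]

ftpcp = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("abc", "c")]
for pair in to_pcp(ftpcp):
    print(pair)
```

An instance with $ n$ tiles yields $ n+2$ pairs, so the transformation is clearly computable.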

We prove the construction is correct. Direction “$ \Rightarrow$ ”. Suppose there is a matching of FTPCP with indices $ a_1 = 1, a_2,\ldots,a_k$ . Then choose the pair $ \left(\begin{array}{l} \lhd \alpha_1 \\ \lhd \beta_1 \rhd \end{array}\right)$ , continue with the pairs $ \left(\begin{array}{l} \lhd \alpha_{a_j} \\ \beta_{a_j} \rhd \end{array}\right)$ for $ j = 2, \ldots, k$ , and finish with the pair $ \left(\begin{array}{l} \lhd @ \\ @ \end{array}\right)$ . Both the top and the bottom then spell the common word of the FTPCP matching with a $ *$ before every symbol, followed by $ *@$ , so this sequence is a matching.

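Direction “$ \Leftarrow$ ”. Suppose we have a matching in our constructed PCP instance. Assuming all words $ \alpha_i, \beta_i$ are non-empty, the pair $ \left(\begin{array}{l} \lhd \alpha_1 \\ \lhd \beta_1 \rhd \end{array}\right)$ is the only one whose top and bottom begin with the same symbol: in every other pair the top begins with $ *$ , while the bottom begins with a symbol from $ \Sigma$ or with $ @$ . It follows that any matching can only start with the pair $ \left(\begin{array}{l} \lhd \alpha_1 \\ \lhd \beta_1 \rhd \end{array}\right)$ , and must continue with pairs $ \left(\begin{array}{l} \lhd \alpha_i \\ \beta_i \rhd \end{array}\right)$ until the pair $ \left(\begin{array}{l} \lhd @ \\ @ \end{array}\right)$ is used. Deleting every $ *$ and $ @$ from the matching up to that point, we immediately obtain a matching in FTPCP that starts with the first tile.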
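The forced first pair can be observed experimentally on a small instance. The sketch below runs a bounded breadth-first search over tile sequences; the six pairs are what the construction yields for the hypothetical FTPCP instance $ (a/ab), (b/ca), (ca/a), (abc/c)$ , which is our own illustration. A bounded search like this can confirm that a matching exists, but it can never show that none exists, which is consistent with undecidability.

```python
from collections import deque

def find_match(tiles, max_len):
    """Breadth-first search for a PCP matching using at most max_len tiles.
    Returns a list of 0-based tile indices, or None if none was found."""
    queue = deque([((), "", "")])
    while queue:
        seq, top, bottom = queue.popleft()
        if seq and top == bottom:
            return list(seq)
        if len(seq) == max_len:
            continue
        for i, (a, b) in enumerate(tiles):
            t, u = top + a, bottom + b
            # Keep only partial matchings where one word is a prefix of the other.
            if t.startswith(u) or u.startswith(t):
                queue.append((seq + (i,), t, u))
    return None

pcp = [
    ("*a", "*a*b*"),   # index 0: start pair, built from tile 1
    ("*a", "a*b*"),    # index 1: tile 1
    ("*b", "c*a*"),    # index 2: tile 2
    ("*c*a", "a*"),    # index 3: tile 3
    ("*a*b*c", "c*"),  # index 4: tile 4
    ("*@", "@"),       # index 5: end pair
]
match = find_match(pcp, 6)
print(match)
```

The matching found begins with the start pair and ends with the end pair; dropping those two and reading the remaining indices, preceded by tile 1, gives the FTPCP matching 1, 2, 3, 1, 4.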