====== - Computability Theory ======

===== - Motivation =====

Goldbach conjecture [Matei: https://en.wikipedia.org/wiki/Wang_tile#Applications]

===== - Problems and problem instances =====

In the previous section, we illustrated the problem SAT, as well as pseudocode which describes a solution running in exponential time. We have seen that such a solution is infeasible in practice, and also that no (predictable) technological advance can help. The main question we asked (and also answered) is whether there exists a faster procedure for solving SAT. (We have conjectured that the answer is //no//, or, more precisely, //not likely//.) To generalize a little, we are interested in the following question:

Can a given problem //Q// be solved in "//efficient//" time?

For now, we sidestep the currently absent definition of "//efficient//", and note that this question spawns another (which is actually more straightforward to ask):

Can a given problem //Q// be "//solved//" at all?

In this chapter, we shall focus on answering the latter question first, and to do so, we need to settle the following issues:

  * What exactly is a "//problem//"?
  * What does it mean to "//solve//" a problem?

$def[Problem instance]
//A// problem instance //is a mathematical object of which we ask a question and expect an answer.//
$end

$def[Problem]
//An (abstract)// problem //is a mapping $math[P:I \rightarrow O], where $math[I] is a set of problem instances of which we ask the same question, and $math[O] is a set of answers. $math[P] assigns to each problem instance $math[i \in I] the answer $math[P(i) \in O].//
$end

It is often the case that the answers we seek are also mathematical objects. For instance, the vector sorting problem must be answered by a sorted vector. However, many other problems prompt //yes/no// answers. Whenever $math[O = \{0, 1\}], we say that $math[P] is a //decision problem//. Many other problems can be cast into decision problems. The vector sorting problem may be seen as a decision problem if we simply ask whether the problem instance (i.e. the vector) is sorted. The original sorting problem and its decision counterpart may not seem equivalent in terms of hardness. For instance, sorting is solved in $math[O(n \log n)] time (using "standard" algorithms), while deciding whether a vector is sorted can be done in linear time. We shall see that, from the point of view of Complexity Theory, both problems are equally hard.

The last two definitions (problem and problem instance) may seem abstract and unusable, for the following reason: the set $math[I] is hard to characterize. One solution may be to assign //types// to problem instances. For example, //graph// may be a problem instance type. However, such a choice forces us to reason about problems separately, based on the type of their problem instances. Also, the types themselves form an infinite set, which is equally difficult to characterize. Another approach is to //level out// problem instances, starting from the following key observations: (i) each $math[i \in I] must be, in some sense, finite. For instance, vectors have a finite length, hence a finite number of elements. Graphs (of which we ask our questions) also have a finite set of nodes, hence a finite set of edges, etc. (ii) $math[I] must be //countable// (but not necessarily finite). For instance, the problem $math[P : \mathbb{R} \times \mathbb{R} \rightarrow \{0, 1\}], where $math[P(x, y)] returns $math[1] if $math[x] and $math[y] are equal, makes no sense from the point of view of Computability Theory.
Assume we would like to answer $math[P(\pi, \sqrt{2})]. Simply storing $math[\pi] and $math[\sqrt{2}] requires infinite space, which is impossible on real machines, and which also takes us back to point (i).

The observations suggest that problem instances can be represented via a //finite encoding//, which may be assumed to be uniform over all possible mathematical objects we consider.

$def[Encoding problem instances]
//Let// $math[\Sigma] //be a finite set which we call an// alphabet//. A// one-letter //word is a member of// $math[\Sigma]//. A// two-letter //word is any member of// $math[\Sigma \times \Sigma = \Sigma^2]//. For instance, if// $math[\Sigma = \{a, b, \ldots\}]//, then// $math[(a, a) \in \Sigma^2] //is a two-letter word. An// i-letter //word is a member of// $math[\Sigma^i]//. We denote by// $math[\Sigma^* = \{\epsilon\} \cup \Sigma \cup \Sigma^2 \cup \ldots \cup \Sigma^i \cup \ldots] //the set of finite words which can be formed over// $math[\Sigma]//.// $math[\epsilon] //is a special word which we call the// empty word//. Instead of writing, e.g.,// $math[(a, b, b, a, a)] //for a 5-letter word, we simply write// abbaa//. Concatenation of two words is defined as usual.//
$end

==== - Remark ====

//We shall henceforth consider that a problem $math[P : I \rightarrow O] has the following property: if $math[I] is infinite, then $math[I \simeq \Sigma^*] ($math[I] is isomorphic with $math[\Sigma^*]). Thus, each problem instance $math[i] can be represented as a finite word $math[enc(i) \in \Sigma^*], for some alphabet $math[\Sigma].//

We shall postpone, for now, the question of choosing the appropriate $math[\Sigma] for our problem (the above remark simply states that such a $math[\Sigma] must exist). The definition of instance encodings and Remark 1.2.1 are easily recognized in practice. A programmer always employs the same predefined mechanisms of the programming language (the available datatypes) to represent program inputs. Moreover, these objects ultimately become streams of bits when they are actually processed by the machine. Making one step further, we can observe the following property of alphabets, which conforms with (ii):

==== - Proposition ====

//For any finite// $math[\Sigma]//,// $math[\Sigma^*] //is countably infinite.//

//Proof//: We show $math[\Sigma^* \simeq \mathbb{N}]. We build a bijective function $math[h] which assigns to each word a unique natural number. We assign $math[0] to $math[\epsilon]. Assume $math[\mid \Sigma \mid = n]. We assign to each one-letter word one of the numbers from $math[1] to $math[n]. Next, to each $math[k \geq 2]-letter word $math[w = w^\prime x] (a word $math[w^\prime] followed by a letter $math[x]) we assign the number $math[n \cdot h(w^\prime) + h(x)]. If $math[n = 2], we easily recognize that each binary word is assigned to its natural-number equivalent. Hence, we have the following diagram:

$math[i \in I \leftrightarrow enc(i) \in \Sigma^* \leftrightarrow h(enc(i)) \in \mathbb{N}]

Hence, we can view a problem instance as a natural number, without losing the ability to uniquely identify the instance at hand. Thus:

$def[Problem]
//A problem is a function $math[f : \mathbb{N} \rightarrow \mathbb{N}]. If some $math[n] encodes a problem input, then $math[f(n)] encodes its answer.//

//A decision problem is a function $math[f : \mathbb{N} \rightarrow \{0, 1\}].//
$end

To conclude: when trying to solve concrete problems, the encoding issue is fundamental, and it depends on the type of problem instances we tackle. From the perspective of Computability Theory, which deals with problems in general, the encoding is inessential, and can be abstracted away, without "//loss of information//", by a natural number.
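To make Proposition 1.2.2 concrete, here is a minimal Python sketch (ours, not part of the formal development) of the bijection $math[h] and its inverse; it is exactly bijective base-$math[n] numeration. The alphabet ''SIGMA'' and the function names are arbitrary choices for illustration.

<code python>
# A minimal sketch of the bijection h from Proposition 1.2.2. Letters are
# numbered 1..n; the empty word maps to 0; a word w'x maps to n*h(w') + h(x)
# (bijective base-n numeration).

SIGMA = "ab"  # any finite alphabet works; "ab" gives the binary case n = 2

def h(word: str) -> int:
    """Map a word over SIGMA to its unique natural number."""
    n = len(SIGMA)
    value = 0
    for letter in word:
        value = n * value + (SIGMA.index(letter) + 1)  # letters count from 1
    return value

def h_inverse(value: int) -> str:
    """Recover the unique word whose number is `value`."""
    n = len(SIGMA)
    word = []
    while value > 0:
        value, digit = divmod(value - 1, n)  # digits range over 1..n, hence -1
        word.append(SIGMA[digit])
    return "".join(reversed(word))

if __name__ == "__main__":
    # Every natural number corresponds to exactly one word, and vice versa.
    for k in range(8):
        print(k, repr(h_inverse(k)), h(h_inverse(k)) == k)
</code>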
===== - Algorithms as Turing Machines =====

Algorithms are usually described as pseudo-code, and intended as abstractions over concrete programming language operations. The level of abstraction is usually not specified rigorously, and is decided in an ad-hoc manner by the writer. From the author's experience, pseudo-code is often dependent on (some future) implementation, and only abstracts from language syntax, possibly including data initialization and subsequent handling. Thus, some pseudo-code can be easily implemented in different languages only to the extent to which the languages are the same, or at least follow the same programming principles.

The above observation is not intended as a criticism of pseudo-code and pseudo-code writing. It is indeed difficult, for instance, to write pseudo-code which does not seem vague, and which can be naturally implemented in an imperative language (using assignments and iterations) as well as in a purely functional language (where iterations are possible only through recursion).

As before, we require a means of //leveling out// different programming styles and programming languages, in order to come up with a uniform, straightforward and simple definition of an algorithm. The key observation here is that programming languages, especially the newest and most popular, are quite restrictive w.r.t. what the programmer can do. This may seem counter-intuitive at first. Consider typed languages, for instance. Enforcing each variable to have a type is obviously a restriction, and has a definite purpose: it helps the programmer write cleaner code, which is less likely to crash at runtime. However, this issue is irrelevant from the point of view of Computability Theory. If we search for less restrictive languages, we find the assembly languages. The restrictions are minimal here (as is the "programming structure"). The formal definition of an algorithm which we propose can be seen as an abstract assembly language, where all technical aspects are put aside. We call such a language the Turing Machine.

$def[Deterministic Turing Machine]
//A// Deterministic Turing Machine //(abbreviated DTM) is a tuple $math[M = (K, F, \Sigma, \delta, s_0)] where://
  * $math[\Sigma = \{a, b, c, \ldots\}] //is a finite set of symbols which we call the //alphabet//;//
  * $math[K] //is a set of states, and $math[F \subseteq K] is a set of //accepting/final states//;//
  * $math[\delta:K\times\Sigma\rightarrow K\times\Sigma\times\{L,H,R\}] //is a transition function which assigns to each state $math[s\in K] and symbol $math[c\in\Sigma] the triple $math[\delta(s,c)=(s^\prime,c^\prime,pos)];//
  * $math[s_0\in K] //is the //initial state//.//

//The Turing Machine has a tape which extends infinitely in both directions, and each tape cell holds a symbol from $math[\Sigma]. The Turing Machine has a// tape head//, which is able to read the symbol from the current cell. Also, the Turing Machine is always in a given state. Initially (before the machine has started), the state is $math[s_0]. From a given state $math[s], the Turing Machine reads the symbol $math[c] from the current cell, and performs a// transition//. The transition is given by $math[\delta(s, c) = (s^\prime, c^\prime, pos)].
Performing the transition means that the $math[TM] moves from state $math[s] to $math[s^\prime], overwrites the symbol $math[c] with $math[c^\prime] on the current tape cell and: (i) if $math[pos = L], moves the tape head to the next cell to the left; (ii) if $math[pos = R], moves the tape head to the next cell to the right; (iii) if $math[pos = H], leaves the tape head on the current cell.//

//The Turing Machine will perform transitions according to $math[\delta].//

//Whenever the TM reaches an accepting/final state, we say it// halts//. If a $math[TM] reaches a non-accepting state where no further transitions are possible, we say it// clings/hangs//.//

  * //the //input// of a Turing Machine is a finite word which is contained on its otherwise empty tape;//
  * //the //output// of a $math[TM] is the contents of the tape (not including empty cells) after the Machine has halted. We also write $math[M(w)] to refer to the output of $math[M], given input $math[w].//
$end

[Matei: Comments on the Turing Machine vs. Programming Languages. The TM is resource-unbound!]

$example[Turing Machine]
//Consider the alphabet $math[\Sigma = \{\#, >, 0, 1\}], the set of states $math[K = \{s_0, s_1, s_2\}], the set of final states $math[F = \{s_2\}] and the transition function://

$math[\delta(s_0, 0) = (s_0, 0 ,R) \quad \quad \delta(s_0, 1) = (s_0, 1, R)]
$math[\delta(s_0, \#) = (s_1, \# ,L) \quad \quad \delta(s_1, 1) = (s_1, 0, L)]
$math[\delta(s_1, 0) = (s_2, 1 ,H) \quad \quad \delta(s_1, >) = (s_2, 1, H)]

//The Turing Machine $math[M = (K, F, \Sigma, \delta, s_0)] reads a number encoded in binary on its tape, and increments it by $math[1]. The symbol $math[\#] encodes the empty tape cell (we shall use $math[\#] to refer to the empty cell throughout the text). Initially, the tape head is positioned at the most significant bit of the number. The Machine first goes over all bits, from left to right. When the first empty cell is detected, the machine goes into state $math[s_1], and starts flipping $math[1]s to $math[0]s, until the first $math[0] (or the initial position, marked by $math[>]) is detected. Finally, the machine places $math[1] on the current cell, and enters its final state.//

{{ :aa:intro:aafigura_2.3.1.jpg?nolink&400 |}}

//The behaviour of the transition function can be represented more intuitively as in the figure above. Each node represents a state, and each edge, a transition. The label on each edge is of the form $math[c/c^\prime,pos], where $math[c] is the symbol read from the current tape cell, $math[c^\prime] is the symbol written on the current tape cell and $math[pos] is a tape head position. The label should be read as: //the machine replaces $math[c] with $math[c^\prime] on the current tape cell and moves in the direction indicated by $math[pos]//.//

//Let us consider that, initially, the tape contains $math[>0111] — the representation of the number $math[7]. The evolution of the tape is shown below. Each line shows the $math[TM] //configuration at step $math[i]//, that is, the tape and current state after transition $math[i]. For convenience, we have chosen to show only two empty cells in each direction.
Also, the underline indicates the position of the tape head.//

^ Transition no ^ Tape ^ Current state ^
| 0 | ##__>__0111## | $math[s_0] |
| 1 | ##>__0__111## | $math[s_0] |
| 2 | ##>0__1__11## | $math[s_0] |
| 3 | ##>01__1__1## | $math[s_0] |
| 4 | ##>011__1__## | $math[s_0] |
| 5 | ##>0111__#__# | $math[s_1] |
| 6 | ##>011__0__## | $math[s_1] |
| 7 | ##>01__0__0## | $math[s_1] |
| 8 | ##>0__0__00## | $math[s_1] |
| 9 | ##>__1__000## | $math[s_2] |
$end
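The example is small enough to run. Below is a minimal Python sketch (our own illustration; the names and data layout are not part of the formalism) of a DTM simulator, instantiated with the increment machine above; the tape is a dictionary from cell index to symbol, with ''#'' as the blank.

<code python>
# A sketch of a DTM simulator, instantiated with the binary-increment machine
# from the example. States and symbols are strings; delta maps
# (state, symbol) -> (state', symbol', move), with move in {"L", "H", "R"}.

BLANK = "#"

delta = {
    ("s0", "0"): ("s0", "0", "R"), ("s0", "1"): ("s0", "1", "R"),
    ("s0", "#"): ("s1", "#", "L"), ("s1", "1"): ("s1", "0", "L"),
    ("s1", "0"): ("s2", "1", "H"), ("s1", ">"): ("s2", "1", "H"),
}
final_states = {"s2"}

def run(word: str, start: str = "s0") -> str:
    tape = dict(enumerate(word))          # cell index -> symbol; missing = blank
    state, head = start, 1                # head starts on the most significant bit
    while state not in final_states:
        symbol = tape.get(head, BLANK)
        if (state, symbol) not in delta:  # no transition: the machine hangs
            raise RuntimeError("machine clings/hangs")
        state, tape[head], move = delta[(state, symbol)]
        head += {"L": -1, "H": 0, "R": 1}[move]
    cells = sorted(tape)                  # output: tape contents, blanks stripped
    return "".join(tape[i] for i in cells).strip(BLANK)

print(run(">0111"))  # -> >1000, i.e. 7 + 1 = 8
</code>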
In order to better understand the Turing Machine, it is useful to establish some similarities with, e.g., assembly languages. As specified in the definition of the Deterministic Turing Machine, a Turing Machine $math[M] specifies a clearly defined behaviour, which is actually captured by $math[\delta]. Thus, $math[M] is quite similar to a specific //program//, performing a definite task. If programs (algorithms) are abstracted by Turing Machines, then what is the abstraction for the programming language? The answer is, again, the Turing Machine. This implies that a Turing Machine acting as a programming language can be fed another Turing Machine acting as a program, and execute it. In the following Proposition, we show how Turing Machines can be encoded as words:

==== - Proposition (TMs as words) ====

//Any Turing Machine $math[M = (K, F, \Sigma, \delta, s_0)] can be encoded as a word over $math[\Sigma]. We write $math[enc(M)] to refer to this word.//

//Proof (sketch):// Intuitively, we encode states and positions as integers $math[n \in \mathbb{N}], transitions as tuples of integers, etc., and subsequently //"convert"// each integer to its word counterpart in $math[\Sigma^*], cf. Proposition 1.2.2. Let $math[NonFin = K \setminus (F \cup \{s_0\})] be the set of non-final states, excluding the initial one. We encode each state in $math[NonFin] as an integer in $math[\{1, 2, \ldots, \mid NonFin \mid\}], and each final state as an integer in $math[\{\mid NonFin \mid + 1, \ldots, \mid NonFin \mid + \mid F \mid\}]. We encode the initial state $math[s_0] as $math[\mid NonFin \mid + \mid F \mid + 1], and $math[L, H, R] as $math[\mid NonFin \mid + \mid F \mid + i], with $math[i \in \{2,3,4\}], respectively. Each such integer is represented as a word of $math[\lceil \log_{\mid \Sigma \mid} (\mid NonFin \mid + \mid F \mid + 4) \rceil] symbols over $math[\Sigma]. Each transition $math[\delta(s, c) = (s^\prime, c^\prime, pos)] is encoded as:

$math[enc(s)\#c\#enc(s^\prime)\#c^\prime\#enc(pos)]

where $math[enc(\cdot)] is the encoding described above. The entire $math[\delta] is encoded as a sequence of encoded transitions, separated by $math[\#]. The encoding of $math[M] is:

$math[enc(M) = enc(\mid NonFin \mid)\#enc(\mid F \mid)\#enc(\delta)]
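The scheme can be sketched directly in code. The following Python fragment (ours) applies the proof's numbering to the increment machine from the earlier example; for readability, the integers are printed in decimal rather than as fixed-width words over $math[\Sigma], but the layout $math[enc(s)\#c\#enc(s^\prime)\#c^\prime\#enc(pos)], with transitions separated by $math[\#], is preserved.

<code python>
# A sketch of the enc(M) scheme from the proof above, applied to the increment
# machine. Decimal integers stand in for the fixed-width words over Sigma.

states, finals, start = {"s0", "s1", "s2"}, {"s2"}, "s0"
delta = {
    ("s0", "0"): ("s0", "0", "R"), ("s0", "1"): ("s0", "1", "R"),
    ("s0", "#"): ("s1", "#", "L"), ("s1", "1"): ("s1", "0", "L"),
    ("s1", "0"): ("s2", "1", "H"), ("s1", ">"): ("s2", "1", "H"),
}

non_fin = sorted(states - finals - {start})        # states numbered 1..|NonFin|
number = {s: i + 1 for i, s in enumerate(non_fin)}
number.update({s: len(non_fin) + i + 1 for i, s in enumerate(sorted(finals))})
number[start] = len(non_fin) + len(finals) + 1     # s0 comes after all others
for i, pos in enumerate("LHR"):                    # L, H, R get the next integers
    number[pos] = len(non_fin) + len(finals) + 2 + i

enc_delta = "#".join(
    f"{number[s]}#{c}#{number[t]}#{d}#{number[pos]}"
    for (s, c), (t, d, pos) in sorted(delta.items())
)
enc_M = f"{len(non_fin)}#{len(finals)}#{enc_delta}"
print(enc_M)  # the word enc(M), ready to be fed to another machine
</code>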
Thus, $math[enc(M)] is a word, which can be fed to another Turing Machine. The latter should have the ability to execute (or to simulate) $math[M]. This is indeed possible:

==== - Proposition (The Universal Turing Machine) ====

//There exists a $math[TM] $math[U] which, for any $math[TM] $math[M] and every word $math[w \in \Sigma^*], takes $math[enc(M)] and $math[w] as input, and outputs $math[1] whenever $math[M(w) = 1] and $math[0] whenever $math[M(w) = 0]. We call $math[U] the// Universal Turing Machine//, and say that $math[U] simulates $math[M].//

//Proof:// Let $math[M] be a $math[TM] and $math[w = c_1c_2 \ldots c_n] be a word built over the alphabet of $math[M]. We build the Universal Turing Machine $math[U] as follows:

  * The input of $math[U] is $math[enc(M)\#enc(s_0)\#c_1\#c_2 \ldots c_n]. Note that $math[enc(s_0)] encodes the initial state of $math[M], while $math[c_1] is the first symbol of $math[w]. The portion of the tape $math[enc(s_0)\#c_1\#c_2 \ldots c_n] will be used to hold the current //configuration// of $math[M], namely the current state of $math[M] (initially $math[s_0]), the contents of $math[M]'s tape, and $math[M]'s current head position. More generally, this portion of the tape is of the form $math[enc(s_i)\#u\#v], with $math[u, v \in \Sigma^*] and $math[s_i] being the current state of $math[M]. The last symbol of $math[u] marks the current symbol, while $math[v] is the word which is to the right of the head. Initially, the current symbol is the first one, namely $math[c_1].
  * $math[U] will scan the current state of $math[M], then the current symbol, and finally will move over the portion of $math[enc(M)] where the transitions are encoded. Once a valid transition is found, $math[U] will execute it:
    - $math[U] will replace the current state with the new one, according to the transition;
    - $math[U] will replace the current symbol, according to the transition;
    - $math[U] will move the marker of the current symbol (i.e. the head position), according to $math[pos] from the transition.
  * $math[U] will repeat this process until an accepting state of $math[M] is reached, or until no transition can be performed.

Propositions 1.3.1 and 1.3.2 show that TMs can characterize both algorithms and the computational framework which executes them. One question remains: what can TMs actually compute? Can they be used to sort vectors, solve SAT, etc.? The answer, which is positive, is given by the following hypothesis:

==== - Conjecture (Church-Turing) ====

//Any problem which can be solved with the Turing Machine is //"universally solvable"//.//

The term //"universally solvable"// cannot be given a precise mathematical definition. We only know solvability w.r.t. concrete means, e.g. computers and programming languages. It can be (and has been) shown that the Turing Machine can solve any problem which known programming languages can solve (to be fair to the TM, one would formulate this statement as: "all programming languages are Turing-complete, i.e. they can solve everything the TM can solve"). The Turing Machine, in itself, describes a //model of computation// based on side-effects: each transition may modify the tape in some way. Computation can be described differently, for instance as function application, or as term rewriting. However, all other known computational models are equivalent to the Turing Machine, in the sense that they solve precisely the same problems. This observation prompted the aforementioned conjecture. It is strongly believed to hold (as evidence suggests), but it cannot be formally proved.

===== - Decidability =====

[Matei: The fact that the totality problem is undecidable means that we cannot write a program that can find any infinite loop in any program. The fact that the equivalence problem is undecidable means that the code optimization phase of a compiler may improve a program, but can never guarantee finding the optimally efficient version of the program. There may be potentially improved versions of the program that it cannot even be sure are equivalent.]

The existence of the Universal Turing Machine $math[U] inevitably leads to interesting questions. Assume $math[M] is a Turing Machine and $math[w] is a word. We use the following convention: we write $math[enc(M)\_w] to represent the input of $math[U]. Thus, $math[U] expects the encoding of a $math[TM], followed by the special symbol $math[\_], and $math[M]'s input $math[w].

$math[(\star)] Does $math[U] halt for all inputs?

If the answer were positive, then $math[U] could be used to tell whether any machine halts, for any given input. We already have some reasons to believe that $math[(\star)] cannot be answered positively, if we examine the proof of Proposition 1.3.2. Actually, $math[(\star)] is a decision problem, one that is quite interesting and useful. As before, we try to lift our setting to a more general one: can every problem be solved by some Turing Machine? The following propositions indicate that this is not likely the case:

==== - Proposition ====

//The set $math[\mathcal{TM}] of Turing Machines is countably infinite.//

//Proof:// The proof follows immediately from Proposition 1.3.1. Any Turing Machine can be uniquely encoded as a word, hence the set of Turing Machines is isomorphic to a subset of $math[\Sigma^*], which is countably infinite for any finite $math[\Sigma].

==== - Proposition ====

//The set $math[Hom(\mathbb{N}, \mathbb{N})] of functions $math[f : \mathbb{N} \rightarrow \mathbb{N}] is uncountably infinite.//

//Proof:// It is sufficient to show that $math[Hom(\mathbb{N}, \{0, 1\})] is uncountably infinite. We build a proof by contradiction. Assume $math[Hom(\mathbb{N}, \{0, 1\})] is countably infinite. Hence, each natural number $math[n \in \mathbb{N}] corresponds to a function $math[f_n \in Hom(\mathbb{N}, \{0, 1\})]. We build a matrix as follows: column $math[j] describes the function $math[f_j], row $math[i] describes the input $math[i \in \mathbb{N}], and each entry $math[m_{i,j}] is the value $math[f_j(i)] (hence, the output of function $math[f_j] for input $math[i]).

$math[\begin{array}{c|cccccc} \mbox{ } & f_0 & f_1 & f_2 & \ldots & f_n & \ldots \\ \hline 0 & 1 & 1 & 0 & \ldots & 0 & \ldots \\ 1 & 0 & 1 & 1 & \ldots & 0 & \ldots \\ 2 & 1 & 0 & 1 & \ldots & 1 & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ n & 1 & 1 & 0 & \ldots & 1 & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \end{array}]

Figure 1.4.2: An example of the matrix from the proof of Proposition 1.4.2. The values $math[m_{i,j}] have been filled in purely for illustration.

We now devise a problem $math[f^*] as follows:

$math[f^*(x)=\left\{\begin{array}{ll}1 & \mbox{iff } f_x(x)=0 \\ 0 & \mbox{iff } f_x(x)=1 \end{array} \right.]

Since $math[f^* \in Hom(\mathbb{N}, \{0, 1\})], it must also have a number assigned to it: $math[f^* = f_\alpha] for some $math[\alpha \in \mathbb{N}]. Assume $math[f_\alpha(\alpha) = 0]; then $math[f^*(\alpha) = 1]. But $math[f_\alpha(\alpha) = f^*(\alpha)]. Contradiction. Assume, on the other hand, $math[f_\alpha(\alpha) = 1]; then $math[f^*(\alpha) = 0]. As before, we obtain a contradiction.
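The diagonal construction is mechanical, and can be demonstrated on a finite fragment of such a table (a toy illustration of ours; the actual argument, of course, runs over the infinite table):

<code python>
# A toy illustration of diagonalization on a finite fragment of the matrix:
# f_star differs from every listed f_j at input j, so it cannot equal any f_j.

fs = [
    lambda x: x % 2,              # f_0
    lambda x: 1,                  # f_1
    lambda x: 0,                  # f_2
    lambda x: 1 if x < 2 else 0,  # f_3
]

def f_star(x: int) -> int:
    return 1 - fs[x](x)           # flip the diagonal entry m_{x,x}

for j in range(len(fs)):
    print(j, f_star(j) != fs[j](j))  # True on every row: f_star differs from f_j at j
</code>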
Propositions 1.4.1 and 1.4.2 tell us that there are infinitely more functions (decision problems) than means of computing them (Turing Machines). Our next step is to look at solvable and unsolvable problems, and devise a method for separating the former from the latter. In other words, we are looking for a tool which allows us to identify those problems which are solvable, and those which are not. We start by observing that Turing Machines may never halt. We write $math[M(w) = \perp] to designate that $math[M] loops infinitely for input $math[w]. Also, we write $math[n^w \in \mathbb{N}] to refer to the number which corresponds to $math[w], according to Proposition 1.2.2. Next, we refine the notion of //problem solving//:

$def[Decision, acceptance]
//Let $math[M] be a Turing Machine and $math[f \in Hom(\mathbb{N}, \{0, 1\})]. We say that://
  * //$math[M] **decides** $math[f] iff for all $math[w \in \Sigma^*]: $math[M(w) = 1] whenever $math[f(n^w) = 1], and $math[M(w) = 0] whenever $math[f(n^w) = 0].//
  * //$math[M] **accepts** $math[f] iff for all $math[w \in \Sigma^*]: $math[M(w) = 1] iff $math[f(n^w) = 1], and $math[M(w) = \perp] iff $math[f(n^w) = 0].//
$end

Note that, in contrast with acceptance, decision is, intuitively, a stronger means of computing a function (i.e. solving a problem). In the former case, the $math[TM] at hand can provide both a //yes// and a //no// answer to any problem instance, while in the latter, the $math[TM] can only provide the //yes// answer. If the answer to the problem instance at hand is //no//, the $math[TM] will not halt. Based on the two types of problem solving, we can classify problems (functions) as follows:

$def[R and RE]
//Let $math[f \in Hom(\mathbb{N}, \{0, 1\})] be a decision problem.//
  * //$math[f] is **recursive** (decidable) iff there exists a $math[TM] $math[M] which decides $math[f]. The set of recursive functions is $math[R = \{f \in Hom(\mathbb{N}, \{0,1\}) \mid f \mbox{ is recursive} \}].//
  * //$math[f] is **recursively enumerable** (semi-decidable) iff there exists a $math[TM] $math[M] which accepts $math[f]. The set of recursively enumerable functions is $math[RE = \{f \in Hom(\mathbb{N},\{0,1\}) \mid f \mbox{ is recursively enumerable} \}].//
$end

Now, let us turn our attention to question $math[(\star)], which we shall formulate as a problem:

$math[f_h(n^{enc(M)\_w})= \left\{ \begin{array}{ll} 1 & \mbox{iff } M(w) \mbox{ halts} \\ 0 & \mbox{iff } M(w) = \perp \end{array} \right.]

Hence, the input of $math[f_h] is a natural number which encodes a Turing Machine $math[M] and an input word $math[w]. The first question we ask is whether $math[f_h \in R].

==== - Proposition ====

$math[f_h \notin R].

//Proof:// Assume $math[f_h \in R], and denote by $math[M_h] the Turing Machine which decides $math[f_h]. We build the Turing Machine $math[D] as follows:

$math[D(enc(M)) = \left\{ \begin{array}{ll} \perp & \mbox{iff } M_h(enc(M)\_enc(M)) = 1 \\ 1 & \mbox{iff } M_h(enc(M)\_enc(M)) = 0 \end{array} \right.]

The existence of the Universal Turing Machine guarantees that $math[D] can indeed be built, since $math[D] simulates $math[M_h]. We note that $math[M_h(enc(M)\_enc(M))] decides whether the $math[TM] $math[M] halts with "//itself//" as input (namely $math[enc(M)]). Assume $math[D(enc(D)) = 1]. Hence $math[M_h(enc(D)\_enc(D)) = 0], that is, machine $math[D] does not halt for input $math[enc(D)]. Hence $math[D(enc(D)) = \perp]. Contradiction. Assume $math[D(enc(D)) = \perp]. Hence $math[M_h(enc(D)\_enc(D)) = 1], and thus $math[D(enc(D))] halts. Contradiction.
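The machine $math[D] is the classic halting paradox in program form. Below is a sketch of ours, in which the hypothetical function ''halts'' plays the role of $math[M_h]; the proposition shows precisely that no such function can be implemented.

<code python>
# A sketch of the machine D, assuming a hypothetical decider
# halts(prog, arg) -> bool playing the role of M_h. Proposition 1.4.3 shows
# that no such function can exist: feeding D its own source is contradictory
# either way.

def halts(prog: str, arg: str) -> bool:
    raise NotImplementedError("cannot exist, by Proposition 1.4.3")

def D(prog: str) -> int:
    if halts(prog, prog):     # M_h(enc(M)_enc(M)) = 1 ...
        while True:           # ... then loop forever (output "bottom")
            pass
    return 1                  # otherwise halt with output 1

# D(source_of_D) halts iff it does not halt -- the contradiction in the proof.
</code>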
We note that the construction of $math[D] mimics the technique applied in the proof of Proposition 1.4.2, which is called //diagonalization//.

==== Exercise ====

//Apply the diagonalization technique from the proof of Proposition 1.4.2, in order to prove Proposition 1.4.3.//

==== - Proposition ====

$math[f_h \in RE]

//Proof:// We build a Turing Machine $math[M_h] which accepts $math[f_h]. Essentially, $math[M_h] is the Universal Turing Machine. $math[M_h(enc(M)\_w)] simulates $math[M(w)], and if $math[M(w)] halts, it outputs $math[1]. If $math[M(w)] does not halt, $math[M_h(enc(M)\_w) = \perp].

Propositions 1.4.3 and 1.4.4 produce a classification for $math[f_h]. The question which we shall answer is how to classify any problem $math[f], by establishing its membership in $math[R] and $math[RE], respectively. We start with a simple proposition:

==== - Proposition ====

$math[R \subsetneq RE]

//Proof:// $math[R \subseteq RE] follows straightforwardly from the definitions of $math[R] and $math[RE]. Let $math[f \in R], and let $math[M_f] be the $math[TM] which decides $math[f]. We build the $math[TM] $math[M^\prime] such that $math[M^\prime(w)=1] iff $math[M_f(w)=1], and $math[M^\prime(w)=\perp] iff $math[M_f(w)=0]: $math[M^\prime] simulates $math[M_f], but enters an infinite loop whenever $math[M_f(w)=0]. $math[M^\prime] accepts $math[f], hence $math[f \in RE]. $math[R \neq RE] has already been shown by Propositions 1.4.3 and 1.4.4: $math[f_h \in RE], but $math[f_h \notin R].
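The construction of $math[M^\prime] is a one-liner in program form. In the sketch below (ours), ''decider'' stands for $math[M_f] and is assumed total, i.e. it always halts with $math[0] or $math[1]:

<code python>
# A sketch of the proof of R ⊆ RE: turning a decider M_f into an acceptor M'.

def make_acceptor(decider):
    def acceptor(w):
        if decider(w) == 1:   # M_f(w) = 1: answer yes
            return 1
        while True:           # M_f(w) = 0: loop forever instead of saying no
            pass
    return acceptor

# Example: deciding "w is sorted" vs. merely accepting it.
is_sorted = lambda w: 1 if list(w) == sorted(w) else 0
accepts_sorted = make_acceptor(is_sorted)
print(accepts_sorted("abc"))  # 1; on "cba" it would loop forever
</code>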
Thus, $math[R] and $math[RE] should be interpreted as a "//scale//" for solvability: membership in $math[R] is complete solvability, membership in $math[RE] is partial solvability, while non-membership in $math[RE] is "//complete//" unsolvability.

==== 1.4.1 Remark ====

//We note that $math[R] and $math[RE] are not the only sets of functions which are used in Computability Theory. It has been shown that there are //"degrees"// of unsolvability, of //"higher level"// than $math[R] and $math[RE]. These degrees are intuitively obtained as follows. We assume we live in a world where $math[f_h] is decidable (recursive). Now, as before, we ask which problems are recursive and which are recursively enumerable. It turns out that, also in this ideal case, there still exist recursive and recursively-enumerable problems, as well as some which are neither. This could be imagined as //"undecidability level 1"//. Now, we take some problem which is in $math[RE] on level 1, and repeat the same assumption: that it is decidable. Again, under this assumption, we find problems in $math[R], in $math[RE] and outside the two, which make up //"undecidability level 2"//. This process can be repeated //ad infinitum//.//

Returning to our simpler classification, we must observe an interesting feature of recursively-enumerable functions, which is also the reason they are called this way.

==== - Proposition ====

//A function $math[f \in Hom(\mathbb{N}, \{0, 1\})] is recursively enumerable iff there exists a Turing Machine which can **enumerate/generate** all elements of $math[A_f = \{w \in \Sigma^* \mid f(n^w) = 1\}]. Intuitively, $math[A_f] is the set of inputs of $math[f] for which the answer is //yes//.//

//Proof:// $math[\Longrightarrow] Suppose $math[f] is recursively enumerable and $math[M] accepts $math[f]. We write $math[w_i] to refer to the //i//-th word from $math[\Sigma^*]. We specify the $math[TM] generating $math[A_f] by the following pseudocode:

$algorithm[$math[GEN()]]
$math[\mbox{static } A_f = \emptyset \mbox{;}]
$math[k=0 \mbox{;}]
$math[\mathbf{while} \mbox{ } \mathit{True} \mbox{ } \mathbf{do}]
$math[\quad \mathbf{for} \mbox{ } 0 \leq i \leq k \mbox{ } \mathbf{do}]
$math[\quad \quad \mbox{run } M(w_i) \mbox{ for at most } k \mbox{ steps;}]
$math[\quad \quad \mathbf{if} \mbox{ } M(w_i) \mbox{ } halts \mbox{ } within \mbox{ } k \mbox{ } steps \mbox{ } and \mbox{ } w_i \notin A_f \mbox{ } \mathbf{then}]
$math[\quad \quad \quad A_f = A_f \cup \{ w_i \};]
$math[\quad \quad \quad \mathbf{return} \mbox{ } w_i;]
$math[\quad \quad \mathbf{end}]
$math[\quad \mathbf{end}]
$math[\quad k=k+1;]
$math[\mathbf{end}]
$end

The value of $math[k] from the **for** has a two-fold usage. First, it is used to explore all inputs $math[w_i : 0 \leq i \leq k]. Second, it is used as a time limit for $math[M]. For each $math[w_i], we run $math[M(w_i)] for at most $math[k] steps. If $math[M(w_i) = 1] in at most $math[k] steps, then $math[w_i] is added to $math[A_f], and then returned (written on the tape). Also, $math[A_f] is //static//: it is preserved between executions of $math[GEN]. If $math[M(w_i) = 1] for some $math[w_i], then there must exist a $math[k \geq i] such that $math[M(w_i)] halts within $math[k] steps. Thus, such a $math[k] will eventually be reached, and every element of $math[A_f] will eventually be generated.

$math[\Longleftarrow] Assume we have the Turing Machine $math[GEN] which generates $math[A_f]. We construct a Turing Machine $math[M] which accepts $math[f]. $math[M] works as follows:

$algorithm[$math[M(w)]]
$math[A_f=\emptyset;]
$math[\mathbf{while} \mbox{ } w \notin A_f \mbox{ } \mathbf{do}]
$math[\quad v=GEN();]
$math[\quad A_f = A_f \cup \{ v \};]
$math[\mathbf{end}]
$math[\mathbf{return} \mbox{ } 1;]
$end

$math[M] simply uses $math[GEN] to generate elements of $math[A_f]. If $math[w \in A_f], it will eventually be generated, and $math[M] will output $math[1]. Otherwise, $math[M] will loop. Thus, $math[M] accepts $math[f].

Proposition 1.4.6 is useful since, in many cases, it is easier to find a generator for $math[f] than a Turing Machine which accepts $math[f].
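GEN's dovetailing strategy (run the first $math[k] candidate inputs for at most $math[k] steps each, for growing $math[k]) can be sketched in Python as follows; ''step_run'' is our hypothetical stand-in for a step-bounded simulation of $math[M], and is not part of the text's formalism.

<code python>
# A sketch of GEN's dovetailing: for k = 0, 1, 2, ... run M on the first k+1
# words for at most k steps each. `step_run(w, k)` stands in for a
# step-bounded run of M: it returns 1 if M(w) outputs 1 within k steps, and
# None otherwise -- crucially, it never diverges.

from itertools import count, islice, product

SIGMA = "ab"

def words():                              # w_0, w_1, ... in length-lex order
    yield ""
    for n in count(1):
        for tup in product(SIGMA, repeat=n):
            yield "".join(tup)

def generate(step_run):
    """Yield every w with f(n^w) = 1, eventually."""
    seen = set()
    for k in count():
        for w in islice(words(), k + 1):  # the first k+1 candidate inputs
            if w not in seen and step_run(w, k) == 1:
                seen.add(w)
                yield w

# Toy use: "M accepts words containing 'b'" -- here step_run can cheat and
# answer instantly, but the dovetailing schedule is the same as in GEN.
step_run = lambda w, k: 1 if "b" in w else None
print(list(islice(generate(step_run), 5)))  # ['b', 'ab', 'ba', 'bb', 'aab']
</code>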
Finally, in what follows, we shall take a few decision problems and apply a //reduction// technique, in order to prove that they are not decidable.

**Halting on all inputs** Let:

$math[f_{all}(n^{enc(M)}) = \left\{ \begin{array}{ll} 1 & \mbox{iff } M \mbox{ halts for all inputs} \\ 0 & \mbox{otherwise} \end{array} \right.]

The technique we use to show $math[f_{all} \notin R] is called a //reduction// (of $math[f_h] to $math[f_{all}]). It proceeds as follows. We assume $math[f_{all} \in R]. Starting from the $math[TM \mbox{ } M_{all}] which decides $math[f_{all}], we build a $math[TM] which decides $math[f_h]. Thus, if $math[f_{all}] were decidable, then $math[f_h] would be decidable, which leads to a contradiction. First, for each fixed $math[TM \mbox{ } M] and fixed input $math[w \in \Sigma^*], we build the $math[TM \mbox{ } \Pi_{M,w}(\omega) = ]//"replace $math[\omega] by $math[w], and then simulate $math[M(w)]"//. It is easy to see that $math[(\forall \omega \in \Sigma^* : \Pi_{M,w}(\omega) \mbox{ halts})] iff $math[M(w)] halts. Now, we build the $math[TM \mbox{ } M_h] which decides $math[f_h]. The input of $math[M_h] is $math[enc(M)\_w]. We construct $math[\Pi_{M,w}] and run $math[M_{all}(enc(\Pi_{M,w}))]. By assumption, $math[M_{all}] must always halt. If the output is $math[1], then $math[\Pi_{M,w}(\omega)] halts for all inputs, hence $math[M(w)] halts. We output $math[1]. If the output is $math[0], then $math[\Pi_{M,w}(\omega)] does not halt for all inputs, hence $math[M(w)] does not halt. We output $math[0]. We have thus built a reduction from $math[f_h] to $math[f_{all}]: using the $math[TM] which decides $math[f_{all}], we have constructed a machine which decides $math[f_h]. Since $math[f_h] is not recursive, we obtain a contradiction.

$def[Turing-reducibility]
//Let $math[f_A, f_B \in Hom(\mathbb{N}, \{0, 1\})]. We say $math[f_A] is// Turing reducible //to $math[f_B], and write $math[f_A \leq_T f_B], iff there exists a// computable transformation //$math[T \in Hom(\mathbb{N},\mathbb{N})] such that $math[f_A(n) = 1] iff $math[f_B(T(n)) = 1].//
$end

==== 1.4.2 Remark (Reducibility) ====

//We note that the transformation $math[T] must be computable, in the sense that a Turing Machine should be able to compute $math[T] for any possible valid input. When proving $math[f_h \leq_T f_{all}], we have taken $math[n^{enc(M)\_w}] (an instance of $math[f_h]) and shown that it can be transformed into $math[n^{enc(\Pi_{M,w})}] (an instance of $math[f_{all}]), such that $math[f_h(n^{enc(M)\_w}) = 1] iff $math[f_{all}(n^{enc(\Pi_{M,w})}) = 1]. A Turing Machine can easily perform the transformation of $math[enc(M)\_w] into $math[enc(\Pi_{M,w})], since it only involves adding some states and transitions which precede the initial state of $math[M]; hence $math[T] is computable.//

**Halting on 111**

$math[f_{111}(n^{enc(M)}) = \left\{ \begin{array}{ll} 1 & \mbox{iff } M(111) \mbox{ halts} \\ 0 & \mbox{otherwise} \end{array} \right.]

We reduce $math[f_h] to $math[f_{111}]. Assume $math[M_{111}] decides $math[f_{111}]. Given a Turing Machine $math[M] and a word $math[w], we construct the machine:

$math[\Pi_{M,w}(\omega) = \mbox{ if } \omega = 111 \mbox{ then } M(w) \mbox{ else loop }]

We observe that: (i) the transformation from $math[enc(M)\_w] to $math[enc(\Pi_{M,w})] is computable, since it involves adding a fixed number of states to $math[M]: these states check the input $math[\omega] and, if it is $math[111], replace it with $math[w] and run $math[M]; (ii) $math[M_{111}(enc(\Pi_{M,w})) = 1] iff $math[M(w)] halts. The reduction is complete: $math[f_{111} \notin R].

**Halting on some input** We define:

$math[f_{any}(n^{enc(M)}) = \left\{ \begin{array}{ll} 1 & \mbox{iff } M(w) \mbox{ halts for some } w \in \Sigma^* \\ 0 & \mbox{otherwise} \end{array} \right.]

We reduce $math[f_{111}] to $math[f_{any}]. We assume $math[f_{any}] is decided by $math[M_{any}]. We construct:

$math[\Pi_M(\omega) = \mbox{ replace } \omega \mbox{ by } 111 \mbox{, then run } M(111)]

Now, $math[M_{any}(enc(\Pi_M)) = 1] iff $math[M(111)] halts, hence we can use $math[M_{any}] to build a machine which decides $math[f_{111}]. Contradiction: $math[f_{any} \notin R].

**Machine halt equivalence** We define:

$math[f_{eq}(n^{enc(M_1)\_enc(M_2)}) = \left\{ \begin{array}{ll} 1 & \mbox{iff for all } w \in \Sigma^* \mbox{ : } M_1(w) \mbox{ halts iff } M_2(w) \mbox{ halts} \\ 0 & \mbox{otherwise} \end{array} \right.]

We reduce $math[f_{all}] to $math[f_{eq}]. Let $math[M_{triv}] be a one-state Turing Machine which halts on every input, and $math[M_{eq}] be the Turing Machine which decides $math[f_{eq}]. Then $math[M_{eq}(enc(M)\_enc(M_{triv})) = 1] iff $math[M] halts on all inputs. We have shown that we can use $math[M_{eq}] in order to build a machine which decides $math[f_{all}]. Contradiction: $math[f_{eq} \notin R].
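Reductions are computable program transformations. Taking program //text// as the encoding, the construction of $math[\Pi_{M,w}] is literally string manipulation, as the following Python sketch (ours) suggests; note that the transformation never runs $math[M], it only rewrites its text.

<code python>
# A sketch of the transformation behind f_h <= f_all: from the text of a
# program M and an input w, build the text of Pi_{M,w}, which ignores its own
# input and runs M(w). The transformation is plain string manipulation, hence
# clearly computable.

def build_pi(m_source: str, w: str) -> str:
    return (
        f"def pi(omega):\n"
        f"    # ignore omega, run M on the fixed input w\n"
        f"{indent(m_source)}\n"
        f"    return M({w!r})\n"
    )

def indent(code: str) -> str:
    return "\n".join("    " + line for line in code.splitlines())

m_source = "def M(x):\n    while x: pass  # halts iff x is empty\n    return 1"
print(build_pi(m_source, ""))  # pi halts on every omega iff M("") halts
</code>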
So far, we have used reductions in order to establish problem non-membership in $math[R]. There are other properties of $math[R] and $math[RE] which can be of use for this task. First, we define:

$def[Complement of a problem]
//Let $math[f \in Hom(\mathbb{N},\{0,1\})]. We denote by $math[\overline{f}] the problem://

$math[\overline{f}(n) = \left\{ \begin{array}{ll} 1 & \mbox{iff } f(n)=0\\0 & \mbox{iff } f(n)=1 \end{array} \right.]

//We call $math[\overline{f}] the// complement //of $math[f].//
$end

For instance, the complement of $math[f_h] is the problem which asks whether a Turing Machine $math[M] does //not// halt for input $math[w]. We also note that $math[\overline{\overline{f}} = f]. Next, we define the class:

$math[coRE = \{f \in Hom(\mathbb{N}, \{0, 1\}) \mid \overline{f} \in RE\}]

$math[coRE] is the set of all problems whose complement is in $math[RE]. We establish that:

==== - Proposition ====

$math[RE \cap coRE = R].

//Proof:// Assume $math[f \in RE \cap coRE]. Hence, there exists a Turing Machine $math[M] which accepts $math[f], and a Turing Machine $math[\overline{M}] which accepts $math[\overline{f}]. We build the Turing Machine:

$math[M^*(w) = \mbox{ for } i \in \mathbb{N} \left\{ \begin{array}{l} \mbox{run } M(w) \mbox{ for } i \mbox{ steps; if } M(w) = 1 \mbox{ within } i \mbox{ steps, return } 1 \mbox{; otherwise:}\\ \mbox{run } \overline{M}(w) \mbox{ for } i \mbox{ steps; if } \overline{M}(w) = 1 \mbox{ within } i \mbox{ steps, return } 0 \mbox{.} \end{array} \right.]

For every input $math[w], exactly one of $math[M(w)] and $math[\overline{M}(w)] is guaranteed to halt, with output $math[1]. Hence $math[M^*] always halts, and it decides $math[f]; thus $math[f \in R]. Conversely, if $math[f \in R], then $math[f \in RE] (Proposition 1.4.5); also, $math[\overline{f} \in R] (by the proposition below), hence $math[\overline{f} \in RE], i.e. $math[f \in coRE].

==== - Proposition ====

$math[f \in R \mbox{ iff } \overline{f} \in R \mbox{.}]

//Proof:// The proposition follows immediately, since the Turing Machine which decides $math[f] can be used to decide $math[\overline{f}] by simply switching its output: from $math[0] to $math[1], and from $math[1] to $math[0]. The same holds for the other direction.
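The interleaving performed by $math[M^*] in the proof of $math[RE \cap coRE = R] above is the standard "race" between two semi-deciders. Below is a sketch of ours, again with hypothetical step-bounded stand-ins ''accept'' and ''co_accept'' which answer within $math[i] steps or return ''None''.

<code python>
# A sketch of M*: race a semi-decider for f against one for its complement.
# `accept(w, i)` / `co_accept(w, i)` return 1 if the corresponding machine
# accepts w within i steps, and None otherwise. Exactly one of them
# eventually answers, so m_star always halts.

from itertools import count

def m_star(w, accept, co_accept):
    for i in count():
        if accept(w, i) == 1:     # M(w) halted with 1 within i steps
            return 1
        if co_accept(w, i) == 1:  # the complement machine accepted w
            return 0

# Toy instance: f(w) = "w has even length".
accept = lambda w, i: 1 if len(w) % 2 == 0 and i >= len(w) else None
co_accept = lambda w, i: 1 if len(w) % 2 == 1 and i >= len(w) else None
print(m_star("ab", accept, co_accept), m_star("abc", accept, co_accept))  # 1 0
</code>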
We conclude this chapter with a very powerful result, which states that an entire //category// of problems does not belong to $math[R].

==== 1.4.1 Theorem (Rice) ====

//Let $math[\mathcal{C} \subseteq RE] be non-empty, and such that the trivial problem $math[f_0], with $math[f_0(n) = 0] for all $math[n], is not in $math[\mathcal{C}]. Given a Turing Machine $math[M], we ask:// "Is the problem accepted by $math[M] in $math[\mathcal{C}]?" //Answering this question is not in $math[R].//

//Proof:// By hypothesis, the trivial problem $math[f_0] is not in $math[\mathcal{C}]. Since $math[\mathcal{C}] is non-empty, let $math[f^* \in \mathcal{C}], and since $math[f^*] is recursively enumerable, let $math[M^*] be the Turing Machine which accepts $math[f^*]. We apply a reduction from a variant of $math[f_{111}], namely $math[f_x]: $math[f_x] asks whether a Turing Machine halts for input $math[x]. We assume we can decide the membership $math[f \in \mathcal{C}] by some Turing Machine. Based on the latter, we construct a Turing Machine which decides $math[f_x] (i.e. solves the halting problem for the particular input $math[x]). Let $math[M_x] be the Turing Machine which //accepts// $math[f_x]. Let:

$math[\Pi_w(\omega) = \mbox{ if } M_x(w) \mbox{ then } M^*(\omega) \mbox{ else loop}]

If $math[f_{\Pi_w}] is the problem accepted by $math[\Pi_w], we show that:

$math[f_{\Pi_w} \in \mathcal{C} \mbox{ iff } M_x(w) \mbox{ halts}]

$math[(\Rightarrow)] Suppose $math[f_{\Pi_w} \in \mathcal{C}]. Then $math[\Pi_w(\omega)] cannot loop for every input $math[\omega \in \Sigma^*]. If it were so, then $math[f_{\Pi_w}] would be the trivial problem $math[f_0], always returning $math[0] for any input, which we have assumed is not in $math[\mathcal{C}]. Thus, $math[M_x(w)] halts.

$math[(\Leftarrow)] Suppose $math[M_x(w)] halts. Then the behaviour of $math[\Pi_w(\omega)] is precisely that of $math[M^*(\omega)]: $math[\Pi_w(\omega)] returns $math[1] whenever $math[M^*(\omega)] returns $math[1], and $math[\Pi_w(\omega) = \perp] whenever $math[M^*(\omega) = \perp]. Hence $math[f_{\Pi_w} = f^*], and since $math[f^* \in \mathcal{C}], also $math[f_{\Pi_w} \in \mathcal{C}].

Thus, deciding whether $math[f_{\Pi_w} \in \mathcal{C}] also decides whether $math[M_x(w)] halts, i.e. decides $math[f_x]. Contradiction.

In Theorem 1.4.1, the set $math[\mathcal{C}] should be interpreted as a //property// of problems, and subsequently of the Turing Machines which accept them. Checking whether some Turing Machine satisfies the given property is undecidable. Consider the property informally described as: //the set of Turing Machines (computer programs) which behave as viruses//. Deciding whether a Turing Machine behaves as a virus (i.e. belongs to this set) is impossible, by Rice's Theorem.
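As a closing illustration, the machine $math[\Pi_w] from the proof also has a compact program form. In the sketch below (ours), ''accept_fx'' stands for $math[M_x] and ''m_star'' for $math[M^*]; both are hypothetical parameters, not concrete implementations.

<code python>
# A sketch of Pi_w from the proof of Rice's theorem. `accept_fx` stands for
# M_x (a semi-decider: it halts, returning 1, iff the machine encoded by w
# halts on x) and `m_star` for M*, a machine accepting some fixed f* in C.
# If we could decide whether the problem accepted by `pi` lies in C, we could
# decide the halting question answered by M_x -- which is impossible.

def make_pi(accept_fx, m_star, w):
    def pi(omega):
        accept_fx(w)          # loops forever unless the machine encoded by w
                              # halts on x; this is the "if ... else loop" guard
        return m_star(omega)  # afterwards, behave exactly like M*
    return pi

# If accept_fx(w) diverges, pi accepts the trivial problem f_0; otherwise pi
# accepts exactly f* -- the two cases distinguished in the proof.
</code>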