====== Complexity Classes ======

===== Preliminaries =====

We define:

$math[DTIME(T(n))=\{ f : \mathbb{N} \rightarrow \{0,1\} \mid f \mbox{ is decidable in time } O(T(n))\}]

and

$math[NTIME(T(n))=\{ f:\mathbb{N} \rightarrow \{0,1\} \mid f \mbox{ is decidable by a } NTM \mbox{ in time } O(T(n))\}]

$math[DTIME(T(n))] is the **class of problems** which are solvable by a Turing Machine having **execution time** $math[O(T(n))], while $math[NTIME(T(n))] is the class of problems which are solvable by a **nondeterministic** Turing Machine having execution time $math[O(T(n))]. These two auxiliary definitions allow us to define:

$math[P = \displaystyle \bigcup\limits_{d \in \mathbb{N}} DTIME(n^d)]

$math[NP = \displaystyle \bigcup\limits_{d \in \mathbb{N}} NTIME(n^d)]

and

$math[EXPTIME = \displaystyle \bigcup\limits_{d \in \mathbb{N}} DTIME\left(2^{n^d}\right)]

  * $math[P] is the class of problems which are decidable in **polynomial time** by a Turing Machine;
  * $math[NP] is the class of problems which are decidable in **polynomial time** by a **non-deterministic Turing Machine**; we also call this **the class of problems decidable in nondeterministic polynomial time**;
  * $math[EXPTIME] is the class of problems which are decidable in **exponential time** by a Turing Machine.

Again, we note that our classification is modest (the classes above do not distinguish between e.g. problems solvable in linear vs quadratic time). A more refined classification goes outside the scope of this lecture.

===== The relationship between P, NP and EXPTIME =====

We start with the following observation:

$math[P \subseteq NP \subseteq EXPTIME \subseteq R]

  * the first inclusion follows from the definition of the nondeterministic Turing Machine, which **extends** that of the Turing Machine.
Hence, if a problem is solvable in polynomial time, it is trivially solvable in nondeterministic polynomial time
  * the second inclusion follows from the Proposition shown in the previous lecture: for any NTM running in polynomial time, there is an equivalent TM running in exponential time. Hence, any problem which is solvable in nondeterministic polynomial time is also solvable in exponential time
  * the last inclusion follows from the definition of $math[EXPTIME], which only contains decidable problems.

The challenging and insightful question is:
  * **which of the above inclusions are //strict//**?

===== Hard and complete problems =====

Let us for now focus on the following inclusion:

$math[P \subseteq NP]

The strictness of this inclusion **remains an open problem in Computer Science**, and may be one of the major statements of Science which are yet to be proved. More precisely, there is no known proof that **some** problem $math[f \in NP] satisfies the statement $math[f\not\in P].

While no formal proof for $math[P \subsetneq NP] exists, computer scientists generally assume that this statement is true, or at least **true to the extent to which we understand computation** (as Turing Machines). This belief is based on some evidence. There exist problems such as $math[SAT] and $math[k-Vertex-Cover] which belong to $math[NP], but for which no polynomial-time algorithm has been found. So, the question is: what is the **common property** which all **difficult** problems from $math[NP] (such as $math[SAT] and $math[k-Vertex-Cover]) have?

==== Hardness ====

The **formal** concept of **hardness** which we introduce serves a twofold purpose:
  * relate problems in terms of difficulty in a rigorous way
  * establish a **relaxed form** of non-membership ($math[\not\in P]) about which we can give proofs (true/false statements instead of beliefs)

$def[Polynomial reduction]
Let $math[f,f'] be two decision problems.
We write $math[f' \leq_p f] (read: $math[f'] is **polynomial time reducible** to $math[f]) iff:
  * $math[f' \leq_T f] ($math[f'] is **Turing reducible** to $math[f])
  * the transformation $math[T] is computable in **polynomial time**
$end

$justprop
If $math[f \in P] and $math[f' \leq_p f] then $math[f' \in P]
$end

$proof
Suppose $math[f \in P]. Hence there exists a TM $math[M_f] which decides $math[f] in polynomial time $math[S_f(n)]. Since $math[f' \leq_p f], there exists a TM $math[T] which transforms the input of $math[f'] into one of $math[f], such that: $math[\forall w: f'(w) = 1 \iff f(T(w)) = 1]. Furthermore, the execution time of $math[T] is a polynomial function $math[S_T(n)]. We construct the Turing Machine $math[M^*] as follows:
  * read the input $math[w]
  * compute $math[w' = T(w)]
  * execute $math[M_f(w')]
Then $math[M^*] decides $math[f']. Moreover, since $math[T] runs in time $math[S_T(n)], its output $math[w'] has length at most $math[S_T(n)], so the execution time of $math[M^*] is $math[O(S_T(n) + S_f(S_T(n)))]. This is a polynomial, because the composition of two polynomials is a polynomial. Hence $math[f'] is solvable in polynomial time.
$end

$justprop
If $math[f' \not\in P] and $math[f' \leq_p f] then $math[f \not\in P].
$end

$proof
Suppose $math[f' \not\in P] and $math[f \in P]. Since $math[f' \leq_p f], via the above proposition, it follows that $math[f' \in P]. Contradiction.
$end

Note that the above proposition establishes an implication ($math[A \implies B]). We cannot practically benefit from this proposition unless we know $math[A] is true (which, as already said, has not been proved for any problem in $math[NP]). However, this proposition is still useful, because it confirms that **polynomial reduction** matches our intuition regarding **hardness**: if $math[f' \leq_p f], then $math[f] is at least as hard as $math[f'].

==== Hardness with respect to a class ====

We now extend the concept of hardness:

$def[Hardness w.r.t. a class]
Let $math[f] be a problem and $math[X] be a complexity class among: $math[NP] and $math[EXPTIME]. We say $math[f] is **X-hard** iff $math[\forall f' \in X] we have $math[f' \leq_p f].
$end

$justprop
Let $math[f] be an **NP-hard** problem.
If $math[f\in P] then $math[P = NP].
$end

$proof
Suppose $math[f] is NP-hard, hence $math[\forall f' \in NP] we have that $math[f' \leq_p f]. Also, since $math[f\in P], let $math[M_f] be the TM which decides $math[f] in polynomial time. As before, we can construct a Turing Machine which decides $math[f'], by: (i) executing the transformation $math[T] on the input, and (ii) running $math[M_f] on the transformed input. Since we can decide $math[f'] in polynomial time, for all $math[f' \in NP], it follows that $math[NP \subseteq P]. Together with $math[P \subseteq NP], this gives $math[P = NP].
$end

$justprop
Let $math[f] be an **NP-hard** problem. If $math[P \neq NP] then $math[f \not \in P].
$end

$proof
Suppose $math[P \neq NP] and let $math[f] be an NP-hard problem. If $math[f \in P], it follows by the above Proposition that $math[P = NP], which contradicts our assumption.
$end

Remarks:
  * An NP-hard problem **captures the difficulty** of the entire class NP: each **instance** of each **problem** in NP can be transformed, in polynomial time, into **some** instance of the NP-hard problem which has the same answer;
  * The concept of NP-hardness relaxes the notion of //non-membership// ($math[\not\in P]), and bypasses our lack of knowledge regarding the $math[P = NP] problem. If $math[P \neq NP], as is generally believed, then NP-hardness directly implies non-membership in P.

$remark[Hardness]
The concept of **hardness** is also technically interesting in a broader scope. It circumvents our lack of knowledge regarding $math[P = NP] by taking a statement of the form $math[f \not \in P] (which we cannot prove), and transforming it into a statement of the form //if $math[P \neq NP] then $math[f \not \in P]//. In essence, proving that a problem is NP-hard means showing an implication of the form //if $math[P \neq NP] then the problem at hand cannot be solved in polynomial time//.
$end

==== Completeness with respect to a class ====

$def[Completeness w.r.t. a class]
Let $math[f] be a problem and $math[X] be a complexity class among: $math[NP] and $math[EXPTIME].
We say $math[f] is **X-complete** iff:
  * $math[f] is **X-hard**
  * $math[f \in X]
$end

While NP-hard problems intuitively characterise problems which are **at least as hard** as any problem in NP, **NP-complete** problems are **the hardest problems in NP**. In general:
  * a **hardness** result establishes a **lower bound**, or (pseudo-)impossibility result - it shows that a problem cannot be solved with less than some computational effort (under an assumption such as $math[P \neq NP])
  * a **membership** result establishes an **upper bound** - it shows that a problem can be solved with (at most) some computational effort.

Finally, a **completeness result** combines the two, completely characterising the difficulty of a problem.

$justprop
$math[\leq_p] is reflexive and transitive.
$end

$justprop
The set of $math[NP]-hard problems is closed under $math[\leq_p].
$end

$justprop
Any two $math[NP]-complete problems are reducible to each other via $math[\leq_p]; that is, the $math[NP]-complete problems form an equivalence class with respect to $math[\leq_p].
$end

We leave the proofs of these simple propositions as an exercise.

===== Is our classification system consistent ? =====

In what follows, we shall establish:
  * whether there exist NP-hard and NP-complete problems
  * how to prove that a problem is NP-hard / NP-complete

$prop[SAT]
The problem $math[SAT] is NP-complete.
$end

We postpone the proof to a future lecture.

==== How to prove a problem is NP-hard / complete ====

$justprop
Suppose $math[f'] is an NP-hard problem. If $math[f' \leq_p f] then $math[f] is NP-hard.
$end

$proof
Let $math[f'] be NP-hard, hence $math[\forall f'' \in NP] we have that $math[f'' \leq_p f']. Thus, for each such $math[f''] there exists a transformation $math[T_1] (depending on $math[f'']) such that: (i) $math[\forall w: f''(w) = 1 \iff f'(T_1(w)) = 1] and (ii) $math[T_1] runs in polynomial time. Since $math[f' \leq_p f], let $math[T_2] be the witnessing transformation.
We can now build a transformation $math[T(w) = T_2(T_1(w))] such that:
  * $math[\forall w: f''(w) = 1 \iff f(T_2(T_1(w))) = 1]
  * $math[T] runs in polynomial time, since it consists of two sequential polynomial-time transformations
$math[T] is a witness that $math[\forall f'' \in NP] we have $math[f'' \leq_p f], hence $math[f] is NP-hard.
$end

How to show a problem $math[f] is NP-hard:
  * Select a problem $math[f'] which is known to be NP-hard. Such a problem exists, since $math[SAT] is NP-complete; we can choose $math[SAT] if no other problem is known.
  * Find a polynomial-time transformation which witnesses $math[f' \leq_p f].

===== Reasons to believe P =\= NP =====

The $math[P=NP] issue can also be given another intuitive interpretation, if we recall the concept of **nondeterminism**. The statement $math[P = NP] can be read as:
  * "//The verification of a solution candidate is as difficult as generating it//" or, alternatively: "//Verifying a given proof $math[P] for $math[A] is as difficult as finding a proof for $math[A]//".
Can you see why this is the case?

Finally, to better understand why it is generally believed that $math[P\neq NP], let us contemplate the consequences of $math[P = NP]:
  * Partial program correctness can be solved efficiently. Techniques such as model checking can be applied without hitting the **state-explosion problem**, to a wide range of applications (including operating system kernels). Bugs are almost eliminated.
  * Mathematical proofs can be generated efficiently. Computers can be used to find proofs for some open problems.
  * Tasks which currently require brute-force (i.e. exponential) search, such as finding passwords or breaking encryption keys, can be carried out in polynomial time.
  * Internet privacy is no longer possible using encryption (e.g. using SSH).
  * Internet commerce and banking are no longer possible.
  * Safe communication is no longer possible (at all levels).

===== Another application of NP-hardness =====

Reductions $math[\leq_p] are a theoretical tool to prove $math[NP]-hardness. Reductions also have practical applications.
For instance, many $math[NP]-complete problems are solved in practice by employing $math[SAT] solvers, which, as discussed in the former chapters, may be quite fast on many practical instances. Thus, a specific problem instance is cast (via an appropriate transformation) into a formula $math[\varphi], such that $math[\varphi] is satisfiable iff the answer to the instance is //yes//.

===== Beyond NP-hardness =====

In this lecture, we study NP-hard and NP-complete problems, using the aforementioned reduction technique. The same technique can be adjusted to explore completeness with respect to other classes, for instance $math[P] and $math[EXPTIME], with slight modifications to the condition imposed on $math[T]. Can you identify the modification required for the definition of $math[P]-hardness?
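
As a concrete illustration of the reduction technique, and of casting a problem instance into a formula $math[\varphi], the following minimal Python sketch transforms a $math[k-Vertex-Cover] instance into a CNF formula which is satisfiable iff the graph has a vertex cover of size at most $math[k]. The encoding (one variable per slot/vertex pair), all function names, and the brute-force ''brute_force_sat'' stand-in for a real SAT solver are assumptions made for illustration; they are not taken from the lecture. Note the direction: the sketch shows $math[k-Vertex-Cover \leq_p SAT] (the practical use of reductions), and does not by itself show that $math[k-Vertex-Cover] is NP-hard.

```python
from itertools import combinations, product


def vertex_cover_to_cnf(n, edges, k):
    """Transformation T (illustrative encoding, not from the lecture).

    Vertices are 0..n-1. Variable var(i, v) means "slot i of the cover
    holds vertex v". A clause is a list of non-zero ints: +x stands for
    the variable x, -x for its negation. Returns (num_vars, clauses).
    """
    def var(i, v):
        return i * n + v + 1

    clauses = []
    # Every edge must have at least one endpoint in some slot.
    for (u, v) in edges:
        clauses.append([var(i, w) for i in range(k) for w in (u, v)])
    # Each slot holds at most one vertex, so at most k vertices are chosen.
    for i in range(k):
        for u, v in combinations(range(n), 2):
            clauses.append([-var(i, u), -var(i, v)])
    return k * n, clauses


def brute_force_sat(num_vars, clauses):
    """Stand-in for a SAT solver: tries every assignment (exponential time)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


def has_vertex_cover(n, edges, k):
    """Reference decider: tries every vertex subset of size at most k."""
    for size in range(min(k, n) + 1):
        for cover in combinations(range(n), size):
            if all(u in cover or v in cover for (u, v) in edges):
                return True
    return False


def decide_via_sat(n, edges, k):
    """The machine M*: compute w' = T(w), then run the SAT decider on w'."""
    return brute_force_sat(*vertex_cover_to_cnf(n, edges, k))
```

The transformation produces $math[k \cdot n] variables and a number of clauses which is polynomial in $math[n], $math[k] and the number of edges, so it runs in polynomial time; only the stand-in solver is exponential. On small graphs, ''decide_via_sat'' can be compared against ''has_vertex_cover'' to check the condition $math[\forall w: f'(w) = 1 \iff f(T(w)) = 1] empirically.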