Complexity Classes
Preliminaries
We define:
$ DTIME(T(n))=\{ f : \mathbb{N} \rightarrow \{0,1\} \mid f \mbox{ is decidable in time } O(T(n))\}$
and
$ NTIME(T(n))=\{ f:\mathbb{N} \rightarrow \{0,1\} \mid f \mbox{ is decidable by a } NTM \mbox{ in time } O(T(n))\}$
$ DTIME(T(n))$ is the class of problems which are solvable by a Turing Machine with execution time $ O(T(n))$ , while $ NTIME(T(n))$ is the class of problems which are solvable by a nondeterministic Turing Machine with execution time $ O(T(n))$ .
These two auxiliary definitions allow us to define:
$ P = \displaystyle \bigcup\limits_{d \in \mathbb{N}} DTIME(n^d)$
$ NP = \displaystyle \bigcup\limits_{d \in \mathbb{N}} NTIME(n^d)$
and
$ EXPTIME = \displaystyle \bigcup\limits_{d \in \mathbb{N}} DTIME\left(2^{n^d}\right)$
- $ P$ is the class of problems which are decidable in polynomial time by a Turing Machine;
- $ NP$ is the class of problems which are decidable in polynomial time by a nondeterministic Turing Machine; we also say that these problems are decidable in nondeterministic polynomial time;
- $ EXPTIME$ is the class of problems which are decidable in exponential time by a Turing Machine;
Again, we note that our classification is modest (the classes above do not distinguish between e.g. problems solvable in linear vs quadratic time). A more refined classification goes outside the scope of this lecture.
The relationship between P, NP and EXPTIME
We start with the following observation:
$ P \subseteq NP \subseteq EXPTIME \subseteq R$
- the first inclusion follows from the definition of the nondeterministic Turing Machine, which extends that of the Turing Machine. Hence, if a problem is solvable in polynomial time, it is trivially solvable in nondeterministic polynomial time
- the second inclusion follows from the Proposition shown in the previous lecture: for any NTM running in polynomial time, we have an equivalent TM running in exponential time. Hence, any problem which is solvable in nondeterministic polynomial time is also solvable in exponential time
- the last inclusion follows from the definition of $ EXPTIME$ , which only contains decidable problems.
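The second inclusion can be illustrated on $ SAT$ : where a nondeterministic machine would "guess" an assignment and check it, a deterministic one can simply try all of them. The following is a minimal sketch, not the construction from the previous lecture; the encoding of formulas (a list of clauses, each a list of nonzero integers, with `-i` standing for the negation of variable `i`) and the function names are assumptions made for illustration.

```python
from itertools import product

def satisfies(cnf, assignment):
    """Check that every clause contains at least one true literal.

    This check takes time polynomial in the size of the formula.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, num_vars):
    """Deterministically try all 2^num_vars assignments (exponential time)."""
    for bits in product([False, True], repeat=num_vars):
        # Variables are numbered 1..num_vars.
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if satisfies(cnf, assignment):
            return True
    return False
```

Each candidate assignment is checked in polynomial time, but there are $ 2^n$ candidates for $ n$ variables, which is where the exponential bound comes from.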
The challenging and insightful question is:
- which of the above inclusions are strict?
Hard and complete problems
Let us for now focus on the following inclusion:
$ P \subseteq NP$
The strictness of this inclusion remains an open problem in Computer Science, and is arguably one of the most important statements in science which are yet to be proved or refuted. More precisely, there is no known proof that some problem $ f \in NP$ satisfies $ f\not\in P$ .
While no formal proof for $ P \subsetneq NP$ exists, computer scientists generally assume this statement to be true, or at least true to the extent to which we understand computation (as Turing Machines).
This belief is based on some evidence: there exist problems such as $ SAT$ and $ k-Vertex-Cover$ which belong to $ NP$ , but for which no polynomial-time algorithm has been found.
So, the question is: what is the common property which all difficult problems from $ NP$ (such as $ SAT$ and $ k-Vertex-Cover$ ) have?
Hardness
The formal concept of hardness which we introduce serves a twofold purpose:
- relate problems in terms of difficulty in a rigorous way
- establish a relaxed form of non-membership ($ \not\in P$ ) on which we can give proofs (true/false statements instead of beliefs)
Definition (Polynomial reduction):
Let $ f,f'$ be two decision problems. We write $ f' \leq_p f$ (read: $ f'$ is polynomial time reducible to $ f$ ) iff:
- $ f' \leq_T f$ ($ f'$ is Turing reducible to $ f$ )
- the transformation $ T$ witnessing the reduction is computable in polynomial time
Proposition:
If $ f \in P$ and $ f' \leq_p f$ then $ f' \in P$
Proof:
Suppose $ f \in P$ . Hence there exists a TM $ M_f$ which decides $ f$ in polynomial time $ S_f(n)$ . Since $ f' \leq_p f$ , there exists a TM $ T$ which transforms the input of $ f'$ into one of $ f$ , such that: $ \forall w: f'(w) = 1 \iff f(T(w)) = 1$ . Furthermore, the execution time of $ T$ is a polynomial function $ S_T(n)$ . We construct the Turing Machine $ M^*$ as follows:
- read the input $ w$
- compute $ w' = T(w)$
- execute $ M_f(w')$
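Then $ M^*$ decides $ f'$ . Moreover, since $ T$ cannot write more than $ S_T(n)$ symbols on an input of size $ n$ , we have $ |T(w)| \leq S_T(n)$ , and the execution time of $ M^*$ is $ O(S_T(n) + S_f(S_T(n)))$ . This is a polynomial, because the composition of two polynomials is a polynomial. Hence $ f'$ is solvable in polynomial time.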
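The construction of $ M^*$ is just function composition, which a toy example can make concrete. In the sketch below, all names and both problems are assumptions chosen for illustration and do not come from the lecture: $ f'$ asks whether a list contains a duplicate, $ f$ asks whether a list contains two equal adjacent elements, and the transformation $ T$ sorts the list.

```python
def transform(w):
    """T: sort the input list (polynomial time)."""
    return sorted(w)

def decide_f(w):
    """M_f: does the list contain two equal adjacent elements? (linear time)"""
    return any(a == b for a, b in zip(w, w[1:]))

def decide_f_prime(w):
    """M*: decide f' ('does the list contain a duplicate?') by way of f."""
    w_prime = transform(w)      # compute w' = T(w)
    return decide_f(w_prime)    # execute M_f on w'
```

A list has a duplicate exactly when its sorted version has two equal neighbours, so $ f'(w) = 1 \iff f(T(w)) = 1$ holds, and both steps run in polynomial time.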
Proposition:
If $ f' \not\in P$ and $ f' \leq_p f$ then $ f \not\in P$ .
Proof:
Suppose $ f' \not\in P$ and $ f \in P$ . Since $ f' \leq_p f$ , via the above proposition, it follows that $ f' \in P$ . Contradiction.
Note that the above proposition establishes an implication ($ A \implies B$ ). We cannot practically benefit from this proposition unless we know $ A$ is true (which, as already said, has not been proved for any problem in $ NP$ ).
However, this proposition is still useful, because it validates the concept of polynomial reduction: it clearly reflects our intuition regarding hardness.
Hardness with respect to a class
We now extend the concept of hardness:
Definition (Hardness w.r.t. a class):
Let $ f$ be a problem and $ X$ be a complexity class among: $ NP$ and $ EXPTIME$ . We say $ f$ is X-hard iff $ \forall f' \in X$ we have $ f' \leq_p f$ .
Proposition:
Let $ f$ be an NP-hard problem. If $ f\in P$ then $ P = NP$ .
Proof:
Suppose $ f$ is NP-hard, hence $ \forall f' \in NP$ we have that $ f' \leq_p f$ . Also, since $ f\in P$ , let $ M_f$ be the TM which decides $ f$ in polynomial time. As before, we can construct a Turing Machine which decides $ f'$ , by: (i) executing the transformation $ T$ which witnesses $ f' \leq_p f$ on the input, and (ii) running $ M_f$ on the transformed input. Since we can decide $ f'$ in polynomial time, for all $ f' \in NP$ , it follows that $ NP \subseteq P$ and hence $ P = NP$ .
Proposition:
Let $ f$ be an NP-hard problem. If $ P \neq NP$ then $ f \not \in P$ .
Proof:
Suppose $ P \neq NP$ and let $ f$ be an NP-hard problem. If $ f \in P$ , it follows by the above Proposition that $ P = NP$ , which contradicts our assumption.
Remarks:
- An NP-hard problem captures the difficulty of the entire class NP: each instance of each problem in NP can be solved by solving some instance of the NP-hard problem;
- The concept of NP-hardness relaxes the notion of non-membership ($ \not\in P$ ), and bypasses our lack of knowledge regarding the $ P = NP$ problem. If $ P \neq NP$ as it is generally believed, then NP-hardness directly implies non-membership in P.
Remark (Hardness):
The concept of hardness is also technically interesting in a broader scope. It circumvents our lack of knowledge regarding $ P = NP$ by taking a statement of the form $ f \not \in P$ (which we cannot prove), and transforming it into a statement of the form "if $ P \neq NP$ then $ f \not \in P$ ". In essence, proving that a problem is NP-hard means showing an implication of the form "if $ P \neq NP$ then the problem at hand cannot be solved in polynomial time".
Completeness with respect to a class
Definition (Completeness w.r.t. a class):
Let $ f$ be a problem and $ X$ be a complexity class among: $ NP$ and $ EXPTIME$ . We say $ f$ is X-complete iff:
- $ f$ is X-hard
- $ f \in X$
While NP-hard problems intuitively characterise problems which are at least as hard as any problem in NP, NP-complete problems are the hardest problems in NP.
In general:
- a hardness result establishes a lower bound or (pseudo-)impossibility result: it shows that a problem is at least as hard as every problem in the class
- a membership result establishes an upper bound: it shows that a problem can be solved with (at most) some computational effort.
Finally, a completeness result combines the two - completely characterising the difficulty of a problem.
Proposition:
$ \leq_p$ is reflexive and transitive.
Proposition:
The set of $ NP$ -hard problems is upward closed under $ \leq_p$ : if $ f'$ is $ NP$ -hard and $ f' \leq_p f$ , then $ f$ is $ NP$ -hard.
Proposition:
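Any two $ NP$ -complete problems are polynomial time reducible to each other, i.e. the set of $ NP$ -complete problems is an equivalence class with respect to mutual $ \leq_p$ -reducibility.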
We leave the proofs of these simple propositions as an exercise.
Is our classification system consistent ?
In what follows, we shall establish:
- whether there exist NP-hard and NP-complete problems
- how to prove that a problem is NP-hard / NP-complete
Proposition (SAT):
The problem $ SAT$ is NP-complete.
We postpone the proof to a future lecture.
How to prove a problem is NP-hard / complete
Proposition:
Suppose $ f'$ is an NP-hard problem. If $ f' \leq_p f$ then $ f$ is NP-hard.
Proof:
Let $ f'$ be NP-hard, hence $ \forall f'' \in NP$ we have that $ f'' \leq_p f'$ , hence there exists a transformation $ T_1$ such that: (i) $ \forall w: f''(w) = 1 \iff f'(T_1(w)) = 1$ and (ii) $ T_1$ runs in polynomial time.
Since $ f' \leq_p f$ , let $ T_2$ be the witnessing transformation. We can now build a transformation $ T(w) = T_2(T_1(w))$ such that:
- $ \forall w: f''(w) = 1 \iff f(T_2(T_1(w))) = 1$
- $ T$ runs in polynomial time, since it consists of two sequential polynomial-time transformations
$ T$ is a witness that $ \forall f'' \in NP$ we have $ f'' \leq_p f$ , hence $ f$ is NP-hard.
How to show a problem $ f$ is NP-hard:
- Select a problem $ f'$ which is known to be NP-hard. Such a problem must exist. We can choose SAT if no other problem is known.
- Find a polynomial-time transformation witnessing that $ f' \leq_p f$ .
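The recipe above can be sketched on a small, well-known reduction. Here $ f'$ is $ k-Vertex-Cover$ (taken as known to be NP-hard) and $ f$ is the Independent-Set problem, which is not discussed in this lecture and is introduced only for illustration. The sketch assumes that $ k-Vertex-Cover$ asks for a cover of size at most $ k$ , that Independent-Set asks for at least $ k$ pairwise non-adjacent vertices, and that vertices are numbered from 0. The transformation relies on the fact that a set of vertices is a cover exactly when its complement is independent. The two brute-force deciders are exponential and are included only to check the transformation on small graphs.

```python
from itertools import combinations

def reduce_vc_to_is(num_vertices, edges, k):
    """T: map a k-Vertex-Cover instance to an Independent-Set instance."""
    return num_vertices, edges, num_vertices - k

def has_vertex_cover(num_vertices, edges, k):
    """Brute force: is there a set of at most k vertices touching every edge?"""
    for size in range(0, min(k, num_vertices) + 1):
        for subset in combinations(range(num_vertices), size):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return True
    return False

def has_independent_set(num_vertices, edges, k):
    """Brute force: is there a set of at least k pairwise non-adjacent vertices?"""
    for size in range(max(k, 0), num_vertices + 1):
        for subset in combinations(range(num_vertices), size):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                return True
    return False
```

The transformation itself only performs a subtraction, so it is clearly computable in polynomial time; the correctness condition $ f'(w) = 1 \iff f(T(w)) = 1$ is what the deciders let us test on examples.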
Reasons to believe $ P \neq NP$
The $ P=NP$ issue can also be given another intuitive interpretation, if we recall the concept of nondeterminism:
- “The verification of a solution candidate is as difficult as generating it” or, alternatively: “Verifying a given proof $ P$ for $ A$ is as difficult as finding a proof for $ A$ ”. Can you see why this is the case?
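The gap between verifying and generating can be made concrete on $ k-Vertex-Cover$ . The sketch below is illustrative only: the function names, the graph encoding (vertices numbered from 0, edges as pairs) and the reading of the problem as "a cover of size at most $ k$ " are assumptions. Verifying a candidate is a single pass over the edges, while the only generation strategy shown here enumerates candidate subsets.

```python
from itertools import combinations

def verify_cover(edges, k, candidate):
    """Polynomial-time check of a proposed solution (a certificate)."""
    chosen = set(candidate)
    return len(chosen) <= k and all(u in chosen or v in chosen for u, v in edges)

def find_cover(num_vertices, edges, k):
    """Exhaustive search over candidate subsets; exponential in the worst case."""
    for size in range(min(k, num_vertices) + 1):
        for candidate in combinations(range(num_vertices), size):
            if verify_cover(edges, k, candidate):
                return set(candidate)
    return None
```

If $ P = NP$ held, the search in `find_cover` could in principle be replaced by a polynomial-time procedure, which is exactly what the quoted statements express.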
Finally, to better understand why it is generally believed that $ P\neq NP$ , let us contemplate the consequences of $ P = NP$ :
- Partial program correctness can be checked efficiently. Techniques such as model checking can be applied without hitting the state-explosion problem, to a wide range of applications (including operating system kernels). Most bugs could be detected automatically.
- Mathematical proofs can be generated efficiently. Computers can be used to find proofs for some open problems.
- Passwords and encryption keys, which today can only be found by brute-force (i.e. exponential) search, can be found in polynomial time.
- Internet privacy is no longer possible using encryption (e.g. using SSH).
- Internet commerce and banking are no longer possible.
- Safe communication is no longer possible (at all levels).
Another application of NP-hardness
Reductions $ \leq_p$ are a theoretical tool to prove $ NP$ -hardness. Reductions also have practical applications. For instance, many $ NP$ -complete problems are solved in practice by employing $ SAT$ solvers, which, as discussed in the former chapters, may be quite fast on typical instances. To this end, a specific problem instance is cast (via an appropriate transformation) into a formula $ \varphi$ , such that $ \varphi$ is satisfiable iff the answer to the instance is yes.
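As an illustration of casting an instance into a formula, the sketch below encodes a $ k-Vertex-Cover$ instance as a CNF formula $ \varphi$ . The encoding is one possible choice and not necessarily the one used in practice: it introduces a variable for every pair (slot, vertex), with $ k$ slots, so $ \varphi$ has $ k \cdot n$ variables and a polynomial number of clauses (assuming $ k \leq n$ ). The problem is read as "a cover of size at most $ k$ ", vertices are numbered from 0, and `is_satisfiable` is a hypothetical stand-in for a real SAT solver that works by exhaustive search on tiny inputs.

```python
from itertools import product

def vertex_cover_to_cnf(num_vertices, edges, k):
    """Encode 'G has a vertex cover of size at most k' as a CNF formula.

    Variable var(i, v) means 'slot i of the cover holds vertex v'.
    Returns (clauses, number_of_variables); literals are nonzero integers.
    """
    def var(slot, vertex):
        return slot * num_vertices + vertex + 1

    clauses = []
    # Every edge must have an endpoint placed in some slot.
    for u, v in edges:
        clauses.append([var(i, w) for i in range(k) for w in (u, v)])
    # Each slot holds at most one vertex.
    for i in range(k):
        for u in range(num_vertices):
            for v in range(u + 1, num_vertices):
                clauses.append([-var(i, u), -var(i, v)])
    return clauses, k * num_vertices

def is_satisfiable(clauses, num_vars):
    """Stand-in for a real SAT solver: exhaustive search, tiny inputs only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

For a triangle, the formula built for $ k = 1$ is unsatisfiable and the one built for $ k = 2$ is satisfiable, matching the fact that a triangle needs two vertices to cover all its edges.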
Beyond NP-hardness
In this lecture, we study NP-hard and NP-complete problems, using the aforementioned reduction technique. The same technique can be adjusted to explore completeness with respect to other classes, for instance $ P$ and $ EXPTIME$ , with slight modifications to the condition imposed on the transformation $ T$ .
Can you identify the modification required for the definition of $ P$ -hardness ?