====== Nondeterminism ======

===== Intuition =====

In the figure below, we show an ASCII map:

<code>
      ---- C3 --- dst --- C4 --- cliff
     /     |              |
    /      |              |
src >>-- C1 ----------- C2 --- cliff
</code>

Suppose a robot starts from position ''src'' and would like to reach ''dst''. On the map we have four crossroads denoted by ''Ci'', with ''i'' from 1 to 4. Now, suppose the robot starts in the direction indicated by ''>>'' but suffered a temporary sensor damage, and **does not know if it missed the left turn**. In fact, the robot has **imperfect information** regarding its current state: it might be at point ''C1'' as well as at point ''C2'' on the map.

To model **imperfect information**, we consider that the current position of the robot is ''{C1,C2}'':
  * one could read this as //the robot is in both ''C1'' and ''C2'' **at the same time**//, or
  * as //the robot **cannot distinguish** ''C1'' from ''C2'', thus it considers them **equally likely**//.

What happens after the robot takes the left turn?
  * if the current position is ''C1'', then the robot is now at ''C3'';
  * if the current position is ''C2'', then the robot is now at ''C4''.

As before, after the left turn, the robot cannot distinguish ''C3'' from ''C4'', thus its current position is ''{C3,C4}''.

We can represent the **view** which the robot has of the map after the sensor damage as a **finite-state machine**:

^ Action\State ^ q0 (initial state) ^ C1 ^ C2 ^ C3 ^ C4 ^
| straight | C1, C2 | C2 | cliff | - | - |
| left | - | C3 | C4 | src | dst |
| right | - | - | - | dst | cliff |

The matrix-like specification of the state machine is very similar to that of the Turing Machine:
  * **states** correspond to positions;
  * **symbols** correspond to actions which the robot can take;
  * **transitions** correspond to robot movements;
  * **the tape** contains a **plan** (e.g. //straight, left, right//) which may or may not end in a final state;
  * we may view **cliff** as a **no-state** (the robot falls from a cliff), and **dst** as a final state.
  * there is no concept of **writing on the tape**, and the head always moves to the right.

Note however that, when the action ''straight'' has been read (i.e. executed), the state machine can **nondeterministically** move to both ''C1'' and ''C2''. Formally, the transition function $math[\delta] can **no longer** be a function, as it would have to assign two different next-states when ''straight'' is read in state $math[q_0].

Questions:
  * when the state machine reaches a **final state**, what does this say about the plan?

$remark[state-machines]
A state machine, as informally described above, is a **non-deterministic automaton**. These computing devices will be studied in more detail in the Formal Languages and Automata lecture.
$end

===== Motivation =====

In the previous section, we have informally introduced the concept of **nondeterminism** as **imperfect information**, to make it more illustrative and easy to understand. Next, we need to explore nondeterminism from a **computational perspective**, and to see how this concept is useful for classifying problems according to their hardness.

==== Boolean Satisfiability (SAT) ====

In this section, we introduce the Boolean Satisfiability problem, and use it as a motivating example to drive our discussion. $math[SAT] is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. $math[SAT] takes as input a boolean formula $math[\psi] in Conjunctive Normal Form (CNF):

$math[\psi = C_1 \wedge C_2 \wedge \ldots \wedge C_n]

where, for each $math[i:1 \leq i \leq n], we have $math[C_i = L_{i1} \vee L_{i2} \vee \ldots \vee L_{im_i}] and, for each $math[j:1 \leq j \leq m_i], we have $math[L_{ij} = x] or $math[L_{ij}=\neg x], where $math[x] is a variable. $math[SAT] outputs $math[1] iff there exists an interpretation $math[I] such that, under $math[I], the formula $math[\psi] is true.
An interpretation is a function that maps each variable (propositional symbol) to one of the truth values //true// and //false//.

==== SAT can be solved in exponential time ====

How can we solve SAT? Given a formula $math[\psi], we take the following steps:
  - Compute the number of variables of $math[\psi] (henceforth referred to as $math[n]). //Requires polynomial time.//
  - Generate every interpretation for the variables of $math[\psi]. //There are $math[2^n] possible interpretations.//
  - Check every interpretation against $math[\psi]. //Requires polynomial time $math[O(n*k)], where $math[k] is the number of clauses of $math[\psi].//

Therefore $math[SAT] can be solved in exponential time, hence $math[SAT \in EXPTIME]. The major source of complexity consists in generating all possible interpretations, on each of which a verification is subsequently done.

Let us assume that $math[SAT] were solvable in polynomial time. Could we find problems (possibly related to $math[SAT]) which are still solvable in exponential time under our assumption? The answer is yes. Let $math[\gamma] be the formula:

$math[\gamma = \forall x_1 \forall x_2 \ldots \forall x_k \psi]

where $math[\psi] is a formula in CNF containing the variables $math[x_1,\ldots,x_k] (and possibly others), and $math[k \in \mathbb{N}] is arbitrarily fixed. Checking if $math[\gamma] is satisfiable is the problem $math[\forall SAT]. An algorithm for $math[\forall SAT] must build //all combinations of 0/1 values for each $math[x_i] with $math[i:1 \leq i \leq k]//, and for each one must solve an instance of the $math[SAT] problem. In total, we have $math[2^k] combinations, and since $math[k] is part of the input, the algorithm runs in exponential time, //even provided that we have an algorithm for $math[SAT] which runs in polynomial time//.
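The three-step exponential procedure above can be sketched in Python. This is a minimal illustration, not the lecture's reference implementation; it assumes a CNF formula is represented as a list of clauses, each clause being a list of nonzero integers (a positive integer $math[i] stands for the literal $math[x_i], a negative one for $math[\neg x_i]).

```python
from itertools import product

def check(interpretation, cnf):
    """Step 3: verify interpretation |= cnf, in O(n*k) time.
    A clause is satisfied iff at least one of its literals is true."""
    return all(
        any(interpretation[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def solve_sat(cnf):
    """Brute-force SAT: enumerate all 2^n interpretations (steps 1-3)."""
    # Step 1: collect the variables occurring in the formula.
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    # Step 2: generate every interpretation (the exponential part).
    for values in product([False, True], repeat=len(variables)):
        interpretation = dict(zip(variables, values))
        if check(interpretation, cnf):
            return interpretation   # a satisfying interpretation was found
    return None                     # no interpretation satisfies the formula

# (x1 v ~x2) ^ (~x1 v x2) is satisfiable; (x1) ^ (~x1) is not.
print(solve_sat([[1, -2], [-1, 2]]) is not None)  # True
print(solve_sat([[1], [-1]]))                     # None
```

Note how the exponential cost sits entirely in the enumeration loop, while each individual verification is cheap; this separation is exactly what the nondeterministic machine of the next section will exploit.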
===== The Nondeterministic Turing Machine =====

$def[Non-deterministic TM]
//A non-deterministic Turing Machine ($math[NTM] for short) is a tuple $math[M=(K,F,\Sigma,\delta,s_0)] over alphabet $math[\Sigma], with $math[K], $math[\Sigma] and $math[s_0] defined as for (conventional) Turing Machines, $math[F=\{s_{yes},s_{no}\}], and $math[\delta \subseteq K \times \Sigma \times K \times \Sigma \times \{L,H,R\}] a **transition relation**.//

An $math[NTM] **terminates** iff it reaches a final state. An $math[NTM] $math[M] **decides** a function $math[f:\mathbb{N} \rightarrow \{0,1\}] iff:
  * $math[f(n^w)=0 \Longrightarrow M(w)] reaches state $math[s_{no}] **on all possible sequences of transitions**, and
  * $math[f(n^w)=1 \Longrightarrow M(w)] reaches state $math[s_{yes}] **on at least one sequence of transitions**.

We say the **running time** of an $math[NTM] $math[M] is $math[T] iff, for all $math[w \in \Sigma^*]:
  * the **shortest successful sequence** of transitions has at most $math[T(\mid w\mid)] steps - **if it exists** - or,
  * the **longest unsuccessful sequence** of transitions has at most $math[T(\mid w\mid)] steps.
$end

We start with a few technical observations. First, note that the $math[NTM] is specifically tailored for decision problems. It has only two final states, which correspond to //yes/no// answers. Also, the machine does not produce an output, and the tape is used merely for internal computations. In essence, these "design choices" for the $math[NTM] are purely for convenience, and alternatives are possible.

Whereas the conventional Turing Machine assigns, for each combination of state and symbol, a unique next-state, overriding symbol and head movement, a nondeterministic machine assigns a //collection// of such elements.

  * The //current configuration// of a conventional Turing Machine is characterized by the current state, the contents of the tape, and the head position.
  * A configuration of the nondeterministic machine corresponds to a //set// of conventional $math[TM] configurations. The intuition is that the $math[NTM] can simultaneously process a set of conventional configurations, in one single step.

While the execution of a Turing Machine can be represented as a //sequence//, that of the $math[NTM] can be represented as a //tree//. A path in the tree corresponds to one sequence of transitions which the $math[NTM] performs.

Now, notice the conditions under which an $math[NTM] decides a function: if at least one sequence of transitions leads to $math[s_{yes}], we interpret the answer of the $math[NTM] as //yes//. Conversely, if **all** sequences of transitions lead to $math[s_{no}], then the machine answers //no//.

Finally, when accounting for the running time of an $math[NTM], we do not count all performed transitions (as might seem reasonable), but only the //length of the longest transition sequence// performed by the machine. The intuition is that all members of the current configuration are processed //in parallel//, during a single step.

===== SAT is solvable in polynomial time on an NTM =====

We build the $math[NTM] $math[M_{SAT}] which solves the $math[SAT] problem discussed previously. First, we assume the existence of $math[M_{chk}(I,\psi)], which takes an interpretation and a formula, both encoded as a single string, and checks whether $math[I \models \psi]. $math[M_{chk}] is a conventional $math[TM], thus upon termination it leaves $math[0] or $math[1] on the tape.

  * **Step 1**: $math[M_{SAT}] computes the number of variables of $math[\psi] (henceforth referred to as $math[n]), and prepends the encoding of $math[\psi] with the encoding of $math[n] in unary (as a sequence of $math[1]s). This step takes $math[O(n)] transitions.
  * **Step 2**: During the former step, $math[M_{SAT}] has created a context for generating interpretations.
In this step, $math[M_{SAT}] goes over each cell of the encoding of $math[n], and non-deterministically places $math[0] or $math[1] in that cell. Thus, after one such transition, there are 2 possible conventional configurations: in the first, bit $math[0] is placed on the first cell of the encoding of $math[n]; in the second, bit $math[1] is placed in the same position. After $math[i] transitions, we have $math[2^i] possible configurations. At the end of this step, we have $math[2^n] possible configurations, and in each one we have a binary word of length $math[n] at the beginning of the tape, which corresponds to one possible interpretation. All sequences of transitions have the same length, thus the execution of this part of $math[M_{SAT}] takes $math[O(n)].

  * **Step 3**: At the end of each sequence illustrated above, we run $math[M_{chk}(I,\psi)], where $math[I] and $math[\psi] are already conveniently on the tape. This step takes $math[O(n*k)], where $math[k] is the number of clauses of $math[\psi].

If we add up the running times of the three steps, we obtain $math[O(n*k)].

===== Conventional and Nondeterministic TMs - a qualitative comparison =====

NTMs definitely seem faster, and thus better at solving problems, than their conventional counterparts. However:
  * just like TMs, NTMs are formal models - this time for **algorithms with embedded parallelism**;
  * NTMs cannot be implemented in practice (they could only be implemented if the //branching// of computations had an upper limit; recall that, in SAT, a branching is done per variable, and the number of variables **depends on the input**).

However, are NTMs more expressive - e.g. able to solve problems which are **not recursive**? The following proposition answers this question:

$justprop
Every function which is decidable by an $math[NTM] in polynomial running time is also decidable by a $math[TM] which runs in exponential time.
$end

Can you think of a way to prove the proposition?
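One standard proof idea is to let a deterministic machine explore the $math[NTM]'s computation tree, one transition sequence at a time. The Python sketch below illustrates this under simplifying assumptions of our own (not from the lecture): a machine is abstracted as a ''successors'' function mapping a configuration to the finite set of next configurations allowed by the transition relation, plus an ''accepting'' predicate for $math[s_{yes}].

```python
from collections import deque

def simulate_deterministically(initial, successors, accepting, max_depth):
    """Deterministic (BFS) exploration of the computation tree of a
    nondeterministic machine. With branching factor b and depth T, the
    tree has up to b**T nodes: polynomial NTM time becomes exponential
    deterministic time, as the proposition states."""
    frontier = deque([(initial, 0)])
    while frontier:
        config, depth = frontier.popleft()
        if accepting(config):
            return True            # some transition sequence reaches s_yes
        if depth < max_depth:      # bound the search by the running time T
            for nxt in successors(config):
                frontier.append((nxt, depth + 1))
    return False                   # every sequence fails within T steps

# Toy machine: nondeterministically guess 2 bits; accept iff they differ.
succ = lambda bits: [bits + (0,), bits + (1,)] if len(bits) < 2 else []
acc = lambda bits: len(bits) == 2 and bits[0] != bits[1]
print(simulate_deterministically((), succ, acc, 2))  # True
```

The search visits at most $math[2^3-1] configurations here, matching the tree of the guessing phase of $math[M_{SAT}] on two variables.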
===== The role of NTMs =====

We have already seen:
  * the formal definition of NTMs (extensions of TMs where the transition function becomes a **relation**);
  * the intuition underlying them;
  * the notion of acceptance and the execution time of NTMs;
  * their relation to TMs in terms of expressive power.

However, the usefulness of NTMs may still be unclear. To convincingly motivate NTMs, let us recall the notion of **acceptance** of a Turing Machine:
  * although not a practical way of solving a problem, acceptance serves the purpose of describing (via the TM) a certain type of **problem hardness**, one in which the algorithm cannot do better than enumerate the elements of an infinite set;
  * via the concept of acceptance, we are able to define a //degree of hardness// - the class RE.

The NTM serves **exactly the same kind of purpose**, regarding complexity. Problems which are solved by the NTM require:
  * iterating over an **exponential number of elements** (e.g. interpretations),
  * but each element can be **built incrementally**, in a //tree-like// fashion, just like SAT interpretations, in polynomial time - this is possible by exploiting nondeterminism;
  * each element can be verified to yield a //yes/no// answer in **polynomial time**.

Thus, nondeterminism is used to identify a very specific ''pattern of complexity'', just like acceptance is used to identify a specific pattern of undecidability. It is important to note that not all problems which are (at best) exponentially solvable exhibit this pattern:
  * it is possible that the generation of all elements is not **incremental** (see the Towers of Hanoi problem) - hence exploring all elements is no longer **nondeterministically polynomial**;
  * it is possible that the verification of a candidate is not polynomial (see extensions of SAT, e.g. QSAT).

==== Convention for describing $math[NTM]s in pseudocode ====

In the previous chapters, we often resorted to traditional pseudocode in order to describe algorithms - that is, Turing Machines.
It is occasionally useful to be able to do the same thing for $math[NTM]s. With this in mind, we introduce some notational conventions. The instruction:

$math[v = choice(A)]

where $math[v] is a variable and $math[A] is a set of values, behaves as follows:
  * the current (non-deterministic) configuration of the $math[NTM] shall contain $math[\mid A \mid] conventional configurations;
  * each conventional configuration corresponds to a distinct value $math[a \in A], and it should be interpreted that $math[v=a] in that particular configuration;
  * the running time of the instruction is $math[O(1)].

We also note that it is not possible to achieve any form of //communication// between conventional configurations. Thus, it is intuitive to think that the processing (execution of a transition) of a conventional configuration is done independently of all other conventional configurations.

We add two additional instructions: **success** and **fail**. They correspond to transitions into the states $math[s_{yes}] and $math[s_{no}], respectively.

We illustrate $math[NTM] pseudocode by re-writing the $math[SAT] algorithm described above. We adopt the same representational conventions from the first chapter, and also re-use the procedure **CHECK**.

$example[Pseudocode]
$math[SolveSAT(\varphi)]
//Let $math[n] be the number of variables in $math[\varphi].//
//Let $math[I] be a vector with $math[n] components, initialised with $math[0].//
$math[\mathbf{for} \mbox{ } i=\overline{0,n-1} \mbox{ :}]
$math[\quad I \lbrack i \rbrack = choice(\{0,1\})]
$math[\mathbf{if} \mbox{ } CHECK(\varphi,I)=0]
$math[\quad fail]
$math[\mathbf{else} \mbox{ } success]
$end
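On a deterministic computer, the //choice/success/fail// conventions can only be emulated by exploring every branch, losing the $math[O(1)] cost of ''choice''. The following Python sketch is one such hypothetical emulation of $math[SolveSAT]; the helper ''check'' stands in for the procedure **CHECK** (CNF clauses as lists of signed integers, interpretation as a tuple of 0/1 values).

```python
def check(interpretation, cnf):
    # Stand-in for CHECK(phi, I): literal l is true iff the sign of l
    # agrees with the 0/1 value chosen for variable |l|.
    return all(any((interpretation[abs(l) - 1] == 1) == (l > 0) for l in c)
               for c in cnf)

def solve_sat(cnf, n):
    """SolveSAT with `choice({0,1})` emulated by backtracking:
    `success` on at least one branch makes the whole run succeed,
    `fail` on every branch makes it fail."""
    def branch(i, interp):
        if i == n:                      # all n choices have been made
            return check(interp, cnf)   # success / fail for this branch
        # I[i] = choice({0,1}): deterministically try both configurations
        return branch(i + 1, interp + (0,)) or branch(i + 1, interp + (1,))
    return branch(0, ())

print(solve_sat([[1, -2], [-1, 2]], 2))  # True  (satisfiable)
print(solve_sat([[1], [-1]], 1))         # False (unsatisfiable)
```

The recursion tree mirrors the $math[NTM]'s computation tree: each ''choice'' doubles the number of conventional configurations, and the branches never communicate, just as stipulated above.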