====== Formal syntax, Unification, Backtracking, Cut, Negation ======

===== Prolog syntax =====

The basic programming building block in Prolog is the **clause**. To define it formally, we first introduce:

<code>
<var> ::= alphanumeric sequence starting with capital
<term> ::= <number> | <atom> | <var> | <predicate>(<term_1>, ..., <term_n>)
</code>

The definition of **terms** is slightly restrictive. For instance, arithmetic expressions (e.g. ''1 + X'') and list //patterns// (e.g. ''[1|[]]'') are also terms. However, each such term can be expressed in the ''<predicate>(...)'' form above, as already illustrated for lists in the previous lecture.

A clause is defined via the following grammar:

<code>
<clause> ::= <res> :- <goal_1>, ..., <goal_m>.
<res> ::= <predicate>(<var_1>, ..., <var_k>)
<goal_i> ::= <predicate>(<var_1>, ..., <var_k>) |
             <term> = <term> |                     // unification
             <var> is <term> |                     // arithmetic assignment
             <term> =:= <term> |                   // arithmetic comparison
             true |                                // default satisfying goal
             fail |                                // default-failing goal
             !                                     // cut (discussed below)
</code>

Let:
<code>
q(X,Y) :- r(X), p(Y).
</code>

serve as an example. Then ''q(X,Y)'' is called the **resolvent**, or **goal**. Likewise, in any query, e.g. ''?- q(X,Y).'', ''q(X,Y)'' is termed the **goal** (which Prolog tries to //prove//). ''r(X)'' and ''p(Y)'' are called sub-goals (which Prolog needs to prove in order to prove ''q(X,Y)'').

There exist special goals, such as ''='' (unification), ''is'' (assignment) or ''=:='' (comparison):
  * the goal ''<term1> = <term2>'' will be satisfied iff the two terms unify. Under unification **variables become bound by a substitution**.
  * the goal ''<var> is <term>'' will be satisfied iff there is no type error, and will result in binding the variable to the **evaluation** of the term.
  * the goal ''<term1> =:= <term2>'' will be satisfied iff **the evaluation of each term** yields the same value.
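The difference between the three special goals is easiest to see side by side. The following is a minimal sketch; the helper names ''unify_demo'', ''is_demo'' and ''compare_demo'' are illustrative only and do not appear elsewhere in this lecture.

<code prolog>
% X = 1 + 2 binds X to the unevaluated term 1+2.
unify_demo(X) :- X = 1 + 2.

% X is 1 + 2 evaluates the right-hand side and binds X to 3.
is_demo(X) :- X is 1 + 2.

% =:= evaluates both sides before comparing the resulting values.
compare_demo :- 1 + 2 =:= 2 + 1.

% ?- unify_demo(X).    X = 1+2.
% ?- is_demo(X).       X = 3.
% ?- compare_demo.     true.
% ?- 1 + 2 = 2 + 1.    false (the two terms do not unify).
</code>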

We also note that the above syntax is not complete, but covers most of the Prolog programming aspects of this lecture.

==== Shorthands ====

Clauses such as:
<code prolog>
p(X,Y) :- true.
</code>

can also be written as:
<code prolog>
p(X,Y).
</code>

Also, a clause which involves unification such as:
<code prolog>
p(X) :- X = f(a). 
</code>

can also be written as:
<code prolog>
p(f(a)).
</code>

Shorthanded unification is often used for lists, and looks similar to Haskell pattern matching, e.g.
<code prolog>
size([],0).
size([_|T],R) :- size(T,R1), R is R1 + 1.
</code>
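To make the shorthand explicit, the same predicate can be written with every unification spelled out as a sub-goal. This is only a sketch for comparison; the name ''size_long'' is hypothetical.

<code prolog>
% Long-hand version of size: the list patterns from the clause heads
% are moved into explicit unification sub-goals.
size_long(L, R) :- L = [], R = 0.
size_long(L, R) :- L = [_|T], size_long(T, R1), R is R1 + 1.

% ?- size_long([a,b,c], N).    N = 3.
</code>

Both versions compute the same relation; the shorthand simply lets the clause head perform the unification.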

==== Common pitfalls ====

The following two clauses:

<code prolog>
... :- ..., X=p(A,B) , ...
... :- ..., p(A,B) , ...
</code>

may look similar, but their subgoals are actually very different:
  * the first is satisfied if the unification of ''X'' with ''p(A,B)'' succeeds (hence ''p(A,B)'' is merely a term here).
  * the second is satisfied if ''p(A,B)'' is satisfied.
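A small sketch of the two situations, assuming a single fact for ''p''; the names ''as_term'' and ''as_goal'' are made up for this illustration.

<code prolog>
p(1, 2).

% Here p(_, _) is only a term: p/2 is never called.
as_term(X) :- X = p(_, _).

% Here p(A, B) is a sub-goal: Prolog must prove it from the facts for p/2.
as_goal(A, B) :- p(A, B).

% ?- as_term(X).       X = p(_, _), with both arguments left unbound.
% ?- as_goal(A, B).    A = 1, B = 2.
</code>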

===== Unification =====

We write $math[t =_S t'] to express that //term $math[t] unifies with $math[t'] under substitution $math[S]//.

A **substitution** is a set of pairs $math[(X,t)], where $math[X] is a variable and $math[t] is a term. As a shorthand, we write $math[t/X] instead of $math[(X,t)].

A **substitution** $math[S] is **consistent** iff for all pairs $math[t_1/X] and $math[t_2/X], $math[t_1 =_S t_2] (the variable $math[X] cannot be bound to different terms which do not unify).


==== Unification rules ====

In what follows, we give some basic unification rules which serve the purpose of illustration. These rules are not implemented per se in the Prolog unification process.

$math[\displaystyle\frac{}{atom =_S atom}]

$math[\displaystyle\frac{S \cup \{t/X\} \text{ is consistent }}{X =_S t}]

For instance, $math[X=_{\{cons(H,T)/X\}} cons(1,void)] since $math[\{cons(H,T)/X, cons(1,void)/X\}] is consistent. However $math[X =_{\{void/X\}} cons(1,void)] is false ($math[X] cannot at the same time be the empty list, and a list with one element).

$math[\displaystyle\frac{p_1=p_2, n = m, t_1 =_{S_1} t'_1, \ldots, t_n =_{S_n} t'_n, S = S_1 \cup \ldots \cup S_n \text{ is consistent} }{p_1(t_1, \ldots, t_n) =_S p_2(t'_1, \ldots, t'_m)}]

For instance:

$math[p(a,q(X,Y),Z) =_S p(X,q(Y,Y),q(X))], where $math[S] is $math[\{a/X,a/Y,q(a)/Z\}].

The unification algorithm from Prolog will compute **the most general substitution** or **the most general unifier** (MGU). In this lecture, we will not provide a formal definition for the MGU. However, we illustrate the concept via an example.

$math[p(X,q(X,Z)) =_S p(Y,q(Y,Z))] for $math[S = \{Y/X,X/Z\}]; however, $math[S] is not an MGU. The constraint that $math[Y] unifies with $math[Z] is superfluous. Here, an MGU is $math[S = \{Y/X\}].
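Both unification examples can be tried directly with ''=''. The sketch below wraps them in two helper predicates whose names (''unify_first'', ''unify_second'') are hypothetical; note that ''p'' and ''q'' occur only as terms here, so no facts are needed.

<code prolog>
% First example: the resulting substitution is {a/X, a/Y, q(a)/Z}.
unify_first(X, Y, Z) :- p(a, q(X, Y), Z) = p(X, q(Y, Y), q(X)).

% Second example: the MGU only identifies X and Y; Z stays unbound.
unify_second(X, Y, Z) :- p(X, q(X, Z)) = p(Y, q(Y, Z)).

% ?- unify_first(X, Y, Z).     X = a, Y = a, Z = q(a).
% ?- unify_second(X, Y, Z).    X = Y.
</code>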

==== Recursive substitutions ====

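Does ''X=f(X)'' produce a valid substitution? Intuitively, such a substitution would contain $math[f(f(...))/X]. Detecting such recursive substitutions requires an //occurs check// (testing whether $math[X] occurs in the term it is being bound to), which many Prolog implementations skip by default for efficiency. **Depending on the Prolog implementation (version)** at hand, the unification may therefore fail or succeed (in the latter case binding $math[X] to a cyclic term).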
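Independently of how plain ''='' behaves, ISO Prolog provides the built-in ''unify_with_occurs_check/2'', which refuses to bind a variable to a term containing that same variable. A minimal sketch (the predicate name ''cyclic_rejected'' is hypothetical):

<code prolog>
% Succeeds exactly when the recursive binding X = f(X) is rejected.
cyclic_rejected :- \+ unify_with_occurs_check(X, f(X)).

% ?- unify_with_occurs_check(X, f(X)).    false.
% ?- unify_with_occurs_check(X, f(a)).    X = f(a).
% ?- cyclic_rejected.                     true.
</code>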

==== Goal satisfaction (Backtracking) ====

During the satisfaction of a goal, Prolog keeps a current substitution which it iteratively updates. When a sub-goal is satisfied, a new "current substitution" is created.

Let us look in more depth at this.

<code prolog>
contains(E,[E|_]).
contains(E,[H|T]) :- E \= H, contains(E,T).
</code>

The execution tree for the goal ''?- contains(2,[1,2,3]).'' is given below:

<code>
contains(2,[1|[2,3]])
    2 \= 1
    contains(2,[2|_]) -> true (continue goal satisfaction)
    contains(2,[2|[3]])   
       2 \= 2 -> false. (re-satisfaction not possible).
</code>

Alternatively, let us look at how ''member'' is implemented:

<code prolog>
member(X,[X|_]).
member(X,[_|T]) :- member(X,T).
</code>

The execution tree for ''member(2,[1,2,3])'' is shown below:
<code>
member(2,[_|[2,3]])
   member(2,[2|_]) -> true (continue goal satisfaction).
   member(2,[_|[3]])
      member(2,[_|[]])
         member(2,[]) -> false (re-satisfaction not possible).
</code>

The difference between ''contains'' and ''member'' is that, while the former stops once an element in the list has been found, the latter continues the search. Is there a good motivation for the ''member'' implementation?
  * build a goal tree for ''member(X,[1,2,3])''
  * build a goal tree for ''contains(X,[1,2,3])''.


=== Backtracking ===

Note that goal satisfaction in Prolog is similar to //pruned backtracking//. 

==== Cut (!) - pruning the goal tree ====

Consider the following program:
<code prolog>
f(a).
f(b).
g(a).
g(b).
     
q(X) :- f(X), g(X).
</code>

The proof tree for ''?- q(X).'' is:
<code prolog>
f(X)
  X = a (subgoal satisfied). Current substitution is {a/X}
    g(X)
      g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
      ... resatisfaction not possible for g(X) under {a/X}.
  X = b. Current substitution is {b/X}
     g(X)
       g(b) true. Attempt re-satisfaction....
       ... not possible
  not possible to re-satisfy f(X). 
not possible to re-satisfy q(X). stop.  
</code>

In Prolog, the special goal //cut//, written ''!'', is satisfied according to the following rules:
  - when it is first reached, it is **implicitly satisfied**.
  - when there is an attempt to **re-satisfy it**, it **fails**.

Let us modify the previous clause to:
<code prolog>
q(X) :- f(X), !, g(X).
</code>

The corresponding proof tree is:
<code prolog>
f(X)
  X = a (subgoal satisfied). Current substitution is {a/X}
    ! (implicitly satisfied)
    g(X)
      g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
      ... resatisfaction not possible for g(X) under {a/X}.
    ! cannot be re-satisfied (fails). f(X) is not retried, so X = b is never explored.
q(X) cannot be re-satisfied. stop
</code>
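The effect of the cut can be observed by collecting all solutions with ''findall/3''. In the sketch below the two variants of ''q'' are given hypothetical names (''q_nocut'', ''q_cut'') so that they can be loaded together.

<code prolog>
f(a).
f(b).
g(a).
g(b).

% Variant without cut: both bindings for X are explored.
q_nocut(X) :- f(X), g(X).

% Variant with cut: f(X) is committed to its first solution.
q_cut(X) :- f(X), !, g(X).

% ?- findall(X, q_nocut(X), L).    L = [a, b].
% ?- findall(X, q_cut(X), L).      L = [a].
</code>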

We turn to a more elaborate example:
<code prolog>
f(a).
f(b).
g(a,c).
g(a,d).
g(b,c).
g(b,d).
  
q(X,Y) :- f(X), !, g(X,Y).
q(test,test).
</code>

The example illustrates a side-effect of the cut semantics: once the cut in the first clause has been reached, the second clause is discarded, so ''q(X,Y)'' will not be satisfied for $math[\{test/X,test/Y\}]. This shows that the effect of cut also extends **to the current goal** (all remaining clauses of ''q''), not only to **the current clause of the current goal**.

Similarly, we have:
<code prolog>
q(test,test) :- !.
q(X,Y) :- f(X), !, g(X,Y).
</code>

where cut prevents the re-satisfaction of ''q'' via the second clause (the second cut will not even be reached). 

Also, consider the following code which extends the previous example:
<code prolog>
r(X,Y) :- q(X,Y).
r(1,2).
</code>

The goal ''?- r(X,Y).'' will be satisfied for $math[\{1/X,2/Y\}], which shows that cut does not have any effect on ''r''. Thus, cut should not be interpreted as a //**global proof tree pruner**//.
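The scope of the cut can be checked by listing all solutions. The sketch below assumes the first definition of ''q'' (the one where ''q(test,test)'' is the last clause) and uses ''findall/3'' to collect the answers as pairs ''X-Y''.

<code prolog>
f(a).
f(b).
g(a,c).
g(a,d).
g(b,c).
g(b,d).

q(X,Y) :- f(X), !, g(X,Y).
q(test,test).

r(X,Y) :- q(X,Y).
r(1,2).

% The cut commits to X = a and discards q(test,test):
% ?- findall(X-Y, q(X,Y), L).    L = [a-c, a-d].

% The cut is local to q, so the second clause of r is still tried:
% ?- findall(X-Y, r(X,Y), L).    L = [a-c, a-d, 1-2].
</code>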

==== Negation ====

Prolog implements negation as **negation as failure**, which rests on **the closed-world assumption**: anything that cannot be proved from the program is taken to be false. In short, the negation ''not(G)'' should be interpreted as //G cannot be proved in the current program//. 

Note that, in First-Order Logic, the negation of a sentence $math[p] is interpreted more generally:
  * it has its own //proof tree// which is independent of that for $math[p]
  * it is possible to have //theories// (programs) where both $math[p] and $math[\neg p] are provable. Such theories are called **inconsistent**. It is also possible that neither $math[p] nor $math[\neg p] is provable. Such a theory is called **incomplete**.

The Prolog implementation of not is given below:
<code prolog>
not(G) :- G,!,fail.
not(_).
</code>

It is also a good illustration of a **design pattern** for logic programming, which is often used in Prolog:
  * The key property of ''not'' is that **it is satisfied without modifying the current substitution**, **even if G may bind variables**: 
    * In the first clause, if G is satisfied, cut is reached for the first time (and is satisfied). Subsequently failure occurs and re-satisfaction is prevented. The second clause is never reached. 
    * However, if G is not satisfied, the cut is never reached, and ''not'' succeeds trivially via the second clause, **without binding any variable from G**. 
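The behaviour described above can be tried out with a copy of the two clauses. The sketch uses the hypothetical name ''my_not'' so that it does not clash with a built-in ''not/1''; the standard operator for negation as failure is ''\+''.

<code prolog>
% Same two clauses as not/1, under a different (made-up) name.
my_not(G) :- G, !, fail.
my_not(_).

% ?- my_not(1 = 2).           true.
% ?- my_not(X = 1).           false (X = 1 can be proved).
% ?- X = 2, my_not(X = 1).    X = 2.
% ?- my_not(X = 1), X = 2.    false: the order of sub-goals matters.
</code>

The last two queries differ only in the order of the sub-goals, which shows that negation as failure is sensitive to which variables are already bound when it is called.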


The reason why a goal such as ''E \= H'' from our ''contains'' example may fail is that ''X \= H'' is actually syntactic sugar for ''not(X = H)''.

More precisely, note that, in ''?- H = 1, not(X = H).'':
  * ''X = H'' succeeds, hence ''not(X = H)'' fails.
  * re-satisfaction of ''not(X = H)'' is not possible, hence any supergoal having ''not(X = H)'' on a branch will also fail.