Formal syntax, Unification, Backtracking, Cut, Negation

Prolog syntax

The basic programming building block in Prolog is the clause. To define it formally, we first introduce:

<var> ::= alphanumeric sequence starting with a capital letter
<term> ::= <number> | <atom> | <var> | <predicate>(<term_1>, ..., <term_n>)

This definition of terms is slightly restrictive. For instance, arithmetic expressions (e.g. 1 + X) and list patterns (e.g. [1|[]]) are also terms. However, each such term can be expressed in the <predicate>(...) form, as already illustrated for lists in the previous lecture.

A clause is defined via the following grammar:

<clause> ::= <res> :- <goal_1>, ..., <goal_m>.
<res> ::= <predicate>(<var_1>, ..., <var_k>)
<goal_i> ::= <predicate>(<var_1>, ..., <var_k>) |
             <term> = <term> |                     // unification
             <var> is <term> |                     // arithmetic assignment
             <term> =:= <term> |                   // arithmetic comparison
             true |                                // default satisfying goal
             fail |                                // default-failing goal
             !                                     // cut (discussed below)

Let:

q(X,Y) :- r(X), p(Y).

serve as an example. Then q(X,Y) is called the resolvent, or goal. Likewise, in any query, e.g. ?- q(X,Y)., q(X,Y) is termed the goal (which Prolog tries to prove). r(X) and p(Y) are called sub-goals (which Prolog needs to prove in order to prove q(X,Y)).

There exist special goals, such as = (unification), is (assignment) or =:= (comparison):

- the goal <term1> = <term2> is satisfied iff the two terms unify. Under unification, variables become bound by a substitution.
- the goal <var> is <term> is satisfied iff <term> can be evaluated as an arithmetic expression without error (in particular, all its variables must already be bound); the variable is then bound to the result of the evaluation.
- the goal <term1> =:= <term2> is satisfied iff both terms, evaluated as arithmetic expressions, yield the same value.
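
The difference between the three special goals is easiest to see on the same expression. The following is a minimal sketch; the predicate names unify_demo, eval_demo and compare_demo are hypothetical and serve only as illustration.

```prolog
% Hypothetical helper predicates illustrating =, is and =:=.

% =/2 only unifies; it never evaluates arithmetic.
unify_demo(X) :- X = 1 + 2.          % X is bound to the term 1 + 2

% is/2 evaluates its right-hand side and binds the variable on the left.
eval_demo(X) :- X is 1 + 2.          % X is bound to the number 3

% =:=/2 evaluates both sides and compares the resulting values.
compare_demo :- 1 + 2 =:= 2 + 1.     % succeeds: both sides evaluate to 3
```

With these definitions, ?- unify_demo(X). answers X = 1+2, while ?- eval_demo(X). answers X = 3. Note also that ?- 1 + 2 = 2 + 1. fails, since the two terms do not unify, although compare_demo succeeds.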

We also note that the above syntax is not complete, but covers most of the Prolog programming aspects of this lecture.

Shorthands

Clauses such as:

p(X,Y) :- true.

can also be written as:

p(X,Y).

Also, a clause which involves unification such as:

p(X) :- X = f(a).

can also be written as:

p(f(a)).

Shorthanded unification is often used for lists, and looks similar to Haskell pattern matching, e.g.

size([],0).
size([_|T],R) :- size(T,R1), R is R1 + 1.
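
To make the shorthand explicit, size can be expanded back into clauses whose heads contain only variables. This is only a sketch: the name size_explicit is hypothetical and is introduced here for comparison.

```prolog
% size_explicit/2 is a hypothetical name; it mirrors size/2 above,
% with the head unifications written out as explicit = goals.
size_explicit(L, R) :- L = [], R = 0.
size_explicit(L, R) :- L = [_|T], size_explicit(T, R1), R is R1 + 1.
```

For a given list, both versions compute the same result: ?- size_explicit([a,b,c],N). answers N = 3.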

Common pitfalls

The following two clauses:

... :- ..., X=p(A,B) , ...
... :- ..., p(A,B) , ...

may look similar, but their subgoals are actually very different:

- the first is satisfied if the unification of X with p(A,B) succeeds (hence p(A,B) is merely a term here).
- the second is satisfied if p(A,B) is satisfied.
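
A small sketch can make the distinction concrete. The single fact for p and the predicate names build_term and prove_goal are assumptions chosen for illustration.

```prolog
% p/2 is given a single fact here, purely for illustration.
p(1, 2).

% build_term/3 (hypothetical): p(A,B) occurs as a term, so nothing is proved.
build_term(A, B, X) :- X = p(A, B).

% prove_goal/2 (hypothetical): p(A,B) occurs as a goal, so it must be provable.
prove_goal(A, B) :- p(A, B).
```

The query ?- build_term(3,4,X). succeeds with X = p(3,4), although p(3,4) cannot be proved. In contrast, ?- prove_goal(3,4). fails, while ?- prove_goal(1,2). succeeds.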

Unification

We write $t =_S t'$ to express that term $t$ unifies with $t'$ under substitution $S$.

A substitution is a set of pairs $(X,t)$, where $X$ is a variable and $t$ is a term. As a shorthand, we write $t/X$ instead of $(X,t)$.

A substitution $S$ is consistent iff for all pairs $t_1/X$ and $t_2/X$ in $S$, $t_1 =_S t_2$ (the variable $X$ cannot be bound to different terms which do not unify).

Unification rules

In what follows, we give some basic unification rules which serve the purpose of illustration. These rules are not implemented per se in the Prolog unification process.

$\displaystyle\frac{}{atom =_S atom}$

$\displaystyle\frac{S \cup \{t/X\} \text{ is consistent}}{X =_S t}$

For instance, $X =_{\{cons(H,T)/X\}} cons(1,void)$ since $\{cons(H,T)/X, cons(1,void)/X\}$ is consistent. However, $X =_{\{void/X\}} cons(1,void)$ is false ($X$ cannot at the same time be the empty list and a list with one element).

$\displaystyle\frac{p_1 = p_2, \quad n = m, \quad t_1 =_{S_1} t'_1, \ldots, t_n =_{S_n} t'_n, \quad S = S_1 \cup \ldots \cup S_n \text{ is consistent}}{p_1(t_1, \ldots, t_n) =_S p_2(t'_1, \ldots, t'_m)}$

For instance:

$p(a,q(X,Y),Z) =_S p(X,q(Y,Y),q(X))$, where $S$ is $\{a/X,a/Y,q(a)/Z\}$.

The unification algorithm of Prolog computes the most general substitution, also called the most general unifier (MGU). In this lecture, we will not provide a formal definition of the MGU. However, we illustrate the concept via an example.

$p(X,q(X,Z)) =_S p(Y,q(Y,Z))$ for $S = \{Y/X,X/Z\}$; however, $S$ is not an MGU. The binding $X/Z$, which forces $Z$ to unify with $X$ (and hence with $Y$), is superfluous. Here, an MGU is $S = \{Y/X\}$.
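
Both examples can be checked directly with the = goal. The sketch below assumes nothing beyond standard built-ins; the predicate names unify_example and mgu_example are hypothetical, and p and q are used only as term constructors.

```prolog
% The example p(a,q(X,Y),Z) = p(X,q(Y,Y),q(X)).
unify_example(X, Y, Z) :-
    p(a, q(X, Y), Z) = p(X, q(Y, Y), q(X)).

% The MGU example: after unification X and Y are the same variable,
% while Z is still a distinct, unbound variable.
mgu_example :-
    p(X, q(X, Z)) = p(Y, q(Y, Z)),
    X == Y,          % X and Y have been identified
    var(Z),          % Z is not bound to anything
    Z \== X.         % in particular, Z was not identified with X
```

The query ?- unify_example(X,Y,Z). answers X = a, Y = a, Z = q(a), and ?- mgu_example. succeeds, which confirms that Prolog does not introduce the superfluous binding for Z.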

Recursive substitutions

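Does X=f(X) produce a valid substitution? Intuitively, such a substitution would contain $f(f(\ldots))/X$. Detecting such recursive substitutions requires the so-called occurs check, which most Prolog implementations omit from = for efficiency. Depending on the Prolog implementation (version) at hand, the unification may therefore fail or succeed (for instance, by building a cyclic term). The ISO built-in unify_with_occurs_check performs the check and fails on such inputs.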
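
When the result must not depend on the implementation, the ISO built-in unify_with_occurs_check/2 can be used in place of =. A minimal sketch follows; the predicate name occurs_check_demo is hypothetical.

```prolog
% unify_with_occurs_check/2 performs unification with the occurs check,
% so unifying X with f(X) is rejected.
occurs_check_demo :-
    \+ unify_with_occurs_check(X, f(X)),
    unify_with_occurs_check(Y, f(a)),   % no cycle here, so this succeeds
    Y == f(a).
```

The query ?- occurs_check_demo. succeeds: the recursive unification is refused, while an ordinary one behaves exactly like =.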

Goal satisfaction (Backtracking)

During the satisfaction of a goal, Prolog keeps a current substitution which it iteratively updates. When a sub-goal is satisfied, a new “current substitution” is created.

Let us look in more depth at this.

contains(E,[E|_]).
contains(E,[H|T]) :- E \= H, contains(E,T).

The execution tree for the goal ?- contains(2,[1,2,3]). is given below:

contains(2,[1|[2,3]])
    2 \= 1
    contains(2,[2|_]) -> true (continue goal satisfaction)
    contains(2,[2|[3]])   
       2 \= 2 -> false. (re-satisfaction not possible).

Alternatively, let us look at how member is implemented:

member(X,[X|_]).
member(X,[_|T]) :- member(X,T).

The execution tree for member(2,[1,2,3]) is shown below:

member(2,[_|[2,3]])
   member(2,[2|_]) -> true (continue goal satisfaction).
   member(2,[_|[3]])
      member(2,[_|[]])
         member(2,[]) -> false (re-satisfaction not possible).

The difference between contains and member is that, while the former stops once the element has been found in the list, the latter continues the search when re-satisfaction is attempted. Is there a good motivation for the member implementation?

- build a goal tree for member(X,[1,2,3]).
- build a goal tree for contains(X,[1,2,3]).
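
The different behaviour under re-satisfaction can also be observed by counting solutions. The sketch below repeats both definitions; my_member is a hypothetical name, used to avoid clashing with the library predicate member, and count_solutions is a hypothetical helper built on the standard findall/3 and length/2.

```prolog
% contains/2 as defined above.
contains(E, [E|_]).
contains(E, [H|T]) :- E \= H, contains(E, T).

% my_member/2 has the same clauses as member/2 shown above.
my_member(X, [X|_]).
my_member(X, [_|T]) :- my_member(X, T).

% count_solutions/2 counts how many times Goal can be satisfied.
count_solutions(Goal, N) :-
    findall(x, Goal, Solutions),
    length(Solutions, N).
```

For a list with a repeated element, ?- count_solutions(contains(2,[1,2,2]),N). answers N = 1, whereas ?- count_solutions(my_member(2,[1,2,2]),N). answers N = 2: my_member is re-satisfied once for each occurrence.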

Backtracking

Note that goal satisfaction in Prolog is similar to pruned backtracking.

Cut (!) - pruning the goal tree

Consider the following program:

f(a).
f(b).
g(a).
g(b).
 
q(X) :- f(X), g(X).

The proof tree for ?- q(X). is:

f(X)
  X = a (subgoal satisfied). Current substitution is {a/X}
    g(X)
      g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
      ... resatisfaction not possible for g(X) under {a/X}.
  X = b. Current substitution is {b/X}
     g(X)
       g(b) true. Attempt re-satisfaction....
       ... not possible
  not possible to re-satisfy f(X). 
not possible to re-satisfy q(X). stop.

In Prolog, the special goal cut, written !, is satisfied according to the following rules:

- when it is first reached, it is implicitly satisfied.
- when there is an attempt to re-satisfy it, it fails.

Let us modify the previous clause to:

q(X) :- f(X), !, g(X).

The corresponding proof tree is:

f(X)
  X = a (subgoal satisfied). Current substitution is {a/X}
    ! (implicitly satisfied)
    g(X)
      g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
      ... resatisfaction not possible for g(X) under {a/X}.
    ! (re-satisfaction attempted: fails). f(X) is not re-satisfied, hence X = b is never tried.
q(X) cannot be re-satisfied. stop
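
The two versions of q can be compared by collecting all their solutions. In this sketch, q_plain and q_cut are hypothetical names given to the version without cut and the version with cut, so that both can be loaded together.

```prolog
f(a).
f(b).
g(a).
g(b).

% The original clause, without cut.
q_plain(X) :- f(X), g(X).

% The modified clause: once f(X) is satisfied, the choice is committed.
q_cut(X) :- f(X), !, g(X).
```

The query ?- findall(X, q_plain(X), L). answers L = [a,b], whereas ?- findall(X, q_cut(X), L). answers L = [a].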

We turn to a more elaborate example:

f(a).
f(b).
g(a,c).
g(a,d).
g(b,c).
g(b,d).
 
q(X,Y) :- f(X), !, g(X,Y).
q(test,test).

The example illustrates a side-effect of the cut semantics: q(X,Y) will not be satisfied for $\{test/X,test/Y\}$. This shows that the effect of cut extends to the whole current goal, not only to the current clause: once the cut has been reached, the remaining clauses of q are no longer tried.

Similarly, we have:

q(test,test) :- !.
q(X,Y) :- f(X), !, g(X,Y).

where cut prevents the re-satisfaction of q via the second clause (the second cut will not even be reached).

Also, consider the following code which extends the previous example:

r(X,Y) :- q(X,Y).
r(1,2).

The goal ?- r(X,Y). will also be satisfied for $\{1/X,2/Y\}$, which shows that cut does not have any effect on r. Thus, cut should not be interpreted as a global proof tree pruner.
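
The local scope of cut can be checked by collecting the solutions of q and r. The sketch below assumes that the "previous example" is the version of q whose first clause is q(test,test) :- !.

```prolog
f(a).
f(b).
g(a, c).
g(a, d).
g(b, c).
g(b, d).

% Assumed version of q: the first clause commits as soon as it matches.
q(test, test) :- !.
q(X, Y) :- f(X), !, g(X, Y).

% The cut inside q prunes q only; the second clause of r is still tried.
r(X, Y) :- q(X, Y).
r(1, 2).
```

Under this assumption, ?- findall(X-Y, q(X,Y), L). answers L = [test-test], while ?- findall(X-Y, r(X,Y), L). answers L = [test-test, 1-2].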

Negation

Prolog implements negation as negation as failure, which relies on the closed-world assumption. In short, the negation not(G) should be interpreted as: G cannot be proved in the current program.

Note that, in First-Order Logic, the negation of a sentence $p$ is interpreted more generally:

- it has its own proof tree, which is independent of that for $p$
- it is possible to have theories (programs) where both $p$ and $\neg p$ are provable. Such theories are called inconsistent. It is also possible that neither $p$ nor $\neg p$ is provable. Such a theory is called incomplete.

The Prolog implementation of not is given below:

not(G) :- G,!,fail.
not(_).

It is also a good illustration of a design pattern often used in Prolog: the combination of cut and fail.

The key property of not is that, when it is satisfied, it does not modify the current substitution, even if G may bind variables:
- In the first clause, if G is satisfied, the cut is reached for the first time (and is satisfied). Subsequently, failure occurs and re-satisfaction is prevented. The second clause is never reached.
- However, if G is not satisfied, the cut is never reached, and not succeeds trivially via the second clause, without binding any variable from G.
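
Both cases can be tried out on a copy of the definition. In the sketch below, my_not is a hypothetical name (not is already built in to many Prolog systems and usually cannot be redefined), and no_binding_demo is a hypothetical helper.

```prolog
% Same two clauses as not/1 above, under a different name.
my_not(G) :- call(G), !, fail.
my_not(_).

% When my_not succeeds, the variables of its argument remain unbound.
no_binding_demo :-
    my_not((X = 1, fail)),   % the binding X = 1 is undone when the inner goal fails
    var(X).
```

The query ?- my_not(1 = 2). succeeds through the second clause, ?- my_not(X = 1). fails because X = 1 can be proved, and ?- no_binding_demo. succeeds.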

The reason why E \= H fails in our contains example is that it is equivalent to not(E = H).

More precisely, note that, in: ?- H = 1, not(X = H).

- X = H succeeds, hence not(X = H) fails.
- re-satisfaction of not(X = H) is not possible, hence any supergoal having not(X = H) on a branch will also fail.
