====== Formal syntax, Unification, Backtracking, Cut, Negation ======

===== Prolog syntax =====

The basic programming building block in Prolog is the **clause**. To define it formally, we first introduce:
<code>
<var> ::= alphanumeric sequence starting with a capital letter
<term> ::= <number> | <atom> | <var> | <predicate>(<term_1>, ..., <term_n>)
</code>

The definition of **terms** is slightly restrictive. For instance, arithmetic expressions (e.g. ''1 + X'') and list //patterns// (e.g. ''[1|[]]'') are also terms. However, each such term can be expressed as a **predicate**, as already illustrated for lists in the previous lecture.

A clause is defined via the following grammar:
<code>
<clause> ::= <res> :- <goal_1>, ..., <goal_m>.
<res>    ::= <predicate>(<var_1>, ..., <var_k>)
<goal_i> ::= <predicate>(<var_1>, ..., <var_k>) |
             <term> = <term> |      // unification
             <var> is <term> |      // arithmetic assignment
             <term> =:= <term> |    // arithmetic comparison
             true |                 // default satisfying goal
             fail |                 // default-failing goal
             !                      // cut (discussed below)
</code>

Let:
<code>
q(X,Y) :- r(X), p(Y).
</code>
serve as an example. Then ''q(X,Y)'' is called the **resolvent**, or **goal**. Also, in any query, e.g. ''?- q(X,Y).'', ''q(X,Y)'' is termed the **goal** (which Prolog tries to //prove//). ''r(X)'' and ''p(Y)'' are called sub-goals (which Prolog needs to prove in order to prove ''q(X,Y)'').

There exist special goals, such as ''='' (unification), ''is'' (assignment) or ''=:='' (comparison):
  * the goal ''<term1> = <term2>'' will be satisfied iff the two terms unify. Under unification **variables become bound by a substitution**.
  * the goal ''<var> is <term>'' will be satisfied iff there is no type error, and will result in binding the variable to the **evaluation** of the term.
  * the goal ''<term1> =:= <term2>'' will be satisfied iff **the evaluation of each term** yields the same value.

We also note that the above syntax is not complete, but it covers most of the Prolog programming aspects of this lecture.

==== Shorthands ====

Clauses such as:
<code prolog>
p(X,Y) :- true.
</code>
can also be written as:
<code prolog>
p(X,Y).
</code>

Also, a clause which involves unification such as:
<code prolog>
p(X) :- X = f(a).
</code>
can also be written as:
<code prolog>
p(f(a)).
</code>

Shorthand unification is often used for lists, and looks similar to Haskell pattern matching, e.g.
<code prolog>
size([],0).
size([_|T],R) :- size(T,R1), R is R1 + 1.
</code>

==== Common pitfalls ====

The following two clauses:
<code prolog>
... :- ..., X=p(A,B) , ...
... :- ..., p(A,B) , ...
</code>
may look similar, but their subgoals are actually very different:
  * the first is satisfied if the unification of ''X'' with ''p(A,B)'' succeeds (hence ''p(A,B)'' is merely a term here).
  * the second is satisfied if ''p(A,B)'' is satisfied (hence ''p(A,B)'' is a goal which Prolog must prove).

===== Unification =====

We write $math[t =_S t'] to express that //term $math[t] unifies with $math[t'] under substitution $math[S]//.

A **substitution** is a set of pairs $math[(X,t)], where $math[X] is a variable and $math[t] is a term. As a shorthand, we write $math[t/X] instead of $math[(X,t)].

A **substitution** $math[S] is **consistent** iff for all pairs $math[t_1/X] and $math[t_2/X] in $math[S], $math[t_1 =_S t_2] (the variable $math[X] cannot be bound to different terms which do not unify).

==== Unification rules ====

In what follows, we give some basic unification rules which serve the purpose of illustration. These rules are not implemented per se in the Prolog unification process.

$math[\displaystyle\frac{}{atom =_S atom}]

$math[\displaystyle\frac{S \cup \{t/X\} \text{ is consistent }}{X =_S t}]

For instance, $math[X=_{\{cons(H,T)/X\}} cons(1,void)] since $math[\{cons(H,T)/X, cons(1,void)/X\}] is consistent.
However $math[X =_{\{void/X\}} cons(1,void)] is false ($math[X] cannot at the same time be the empty list and a list with one element).

$math[\displaystyle\frac{p_1=p_2, n = m, t_1 =_{S_1} t'_1, \ldots, t_n =_{S_n} t'_n, S = S_1 \cup \ldots \cup S_n \text{ is consistent} }{p_1(t_1, \ldots, t_n) =_S p_2(t'_1, \ldots, t'_m)}]

For instance: $math[p(a,q(X,Y),Z) =_S p(X,q(Y,Y),q(X))], where $math[S] is $math[\{a/X,a/Y,q(a)/Z\}].

The unification algorithm from Prolog will compute **the most general substitution** or **the most general unifier** (MGU). In this lecture, we will not provide a formal definition for the MGU. However, we illustrate the concept via an example.

$math[p(X,q(X,Z)) =_S p(Y,q(Y,Z))] for $math[S = \{Y/X,X/Z\}]; however, $math[S] is not an MGU. The constraint that $math[Y] unifies with $math[Z] is superfluous. Here, an MGU is $math[S = \{Y/X\}].

==== Recursive substitutions ====

Does ''X=f(X)'' produce a valid substitution? Intuitively, such a substitution would contain $math[f(f(...))/X]. **Depending on the Prolog implementation (version)** at hand, the unification may fail or succeed: many implementations skip the //occurs check// by default, so ''X = f(X)'' may succeed and bind ''X'' to a cyclic term, whereas the ISO predicate ''unify_with_occurs_check(X, f(X))'' fails.

==== Goal satisfaction (Backtracking) ====

During the satisfaction of a goal, Prolog keeps a current substitution which it iteratively updates. When a sub-goal is satisfied, a new "current substitution" is created. Let us look at this in more depth.
<code prolog>
contains(E,[E|_]).
contains(E,[H|T]) :- E \= H, contains(E,T).
</code>

The execution tree for the goal ''?- contains(2,[1,2,3]).'' is given below:
<code>
contains(2,[1|[2,3]])
    2 \= 1
    contains(2,[2|_]) -> true (continue goal satisfaction)
    contains(2,[2|[3]])
        2 \= 2 -> false. (re-satisfaction not possible).
</code>

Alternatively, let us look at how ''member'' is implemented:
<code prolog>
member(X,[X|_]).
member(X,[_|T]) :- member(X,T).
</code>

The execution tree for ''member(2,[1,2,3])'' is shown below:
<code>
member(2,[_|[2,3]])
    member(2,[2|_]) -> true (continue goal satisfaction).
    member(2,[_|[3]])
        member(2,[_|[]])
            member(2,[]) -> false (re-satisfaction not possible).
</code>

The difference between ''contains'' and ''member'' is that, while the former stops once an element in the list has been found, the latter continues the search. Is there a good motivation for the ''member'' implementation?
  * build a goal tree for ''member(X,[1,2,3])''
  * build a goal tree for ''contains(X,[1,2,3])''.

=== Backtracking ===

Note that goal satisfaction in Prolog is similar to //pruned backtracking//.

==== Cut (!) - pruning the goal tree ====

Consider the following program:
<code prolog>
f(a).
f(b).
g(a).
g(b).
q(X) :- f(X), g(X).
</code>

The proof tree for ''?- q(X)'' is:
<code>
f(X)
    X = a (subgoal satisfied). Current substitution is {a/X}
        g(X)
            g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
            ... re-satisfaction not possible for g(X) under {a/X}.
    X = b. Current substitution is {b/X}
        g(X)
            g(b) true. The goal q(X) is satisfied under {b/X}. Attempt re-satisfaction ...
            ... not possible
    not possible to re-satisfy f(X).
not possible to re-satisfy q(X). stop.
</code>

In Prolog, the special goal //cut//, written ''!'', is satisfied according to the following rules:
  - when it is first reached, it is **implicitly satisfied**.
  - when there is an attempt to **re-satisfy it**, it **fails**, and the sub-goals to its left are not retried.

Let us modify the previous clause to:
<code prolog>
q(X) :- f(X), !, g(X).
</code>

The corresponding proof tree is:
<code>
f(X)
    X = a (subgoal satisfied). Current substitution is {a/X}
        ! (implicitly satisfied)
        g(X)
            g(a) true. The goal q(X) is satisfied under {a/X}. Attempt re-satisfaction ...
            ... re-satisfaction not possible for g(X) under {a/X}.
        ! (re-satisfaction attempted: fails). f(X) is not retried, so X = b is never tried.
q(X) cannot be re-satisfied. stop
</code>

We turn to a more elaborate example:
<code prolog>
f(a).
f(b).
g(a,c).
g(a,d).
g(b,c).
g(b,d).
q(X,Y) :- f(X), !, g(X,Y).
q(test,test).
</code>

The example illustrates a side-effect of the cut semantics: ''q(X,Y)'' will not be satisfied for $math[\{test/X,test/Y\}]; the query ''?- q(X,Y).'' only yields $math[\{a/X,c/Y\}] and $math[\{a/X,d/Y\}]. This shows that the effect of cut also extends **to the current goal**, not only **the current clause of the current goal**. Similarly, we have:
<code prolog>
q(test,test) :- !.
q(X,Y) :- f(X), !, g(X,Y).
</code>
where, for the query ''?- q(X,Y).'', cut prevents the re-satisfaction of ''q'' via the second clause (the second cut will not even be reached).

Also, consider the following code which extends the previous example:
<code prolog>
r(X,Y) :- q(X,Y).
r(1,2).
</code>

The goal ''?- r(X,Y)'' will also be satisfied for $math[\{1/X,2/Y\}], which shows that cut does not have any effect on ''r''. Thus, cut should not be interpreted as a //**global proof tree pruner**//.

==== Negation ====

Prolog implements negation as **negation as failure**, which relies on **the closed-world assumption**. In short, the negation ''not(G)'' should be interpreted as //G cannot be proved in the current program//. Note that, in First-Order Logic, the negation of a sentence $math[p] is interpreted more generally:
  * it has its own //proof tree// which is independent of that for $math[p]
  * it is possible to have //theories// (programs) where both $math[p] and $math[\neg p] are provable. Such theories are called **inconsistent**. It is also possible that neither $math[p] nor $math[\neg p] is provable. Such a theory is called **incomplete**.

The Prolog implementation of ''not'' is given below:
<code prolog>
not(G) :- G,!,fail.
not(_).
</code>

It is also a good illustration of a **design pattern** for logic programming, which is often used in Prolog:
  * The key property of ''not'' is that **it satisfies without modifying the current substitution**, **even if G may bind variables**:
  * In the first clause, if G is satisfied, the cut is reached for the first time (and is satisfied). Subsequently, failure occurs and re-satisfaction is prevented. The second clause is never reached.
  * However, if G is not satisfied, the cut is never reached, and ''not'' succeeds trivially via the second clause, **without binding any variable from G**.

The reason why ''E \= H'' can fail in our ''contains'' example is that it is actually syntactic sugar for ''not(E = H)''. More precisely, note that, in: ''?- H = 1, not(X = H).''
  * ''X = H'' succeeds, hence ''not(X = H)'' fails.
  * re-satisfaction of ''not(X = H)'' is not possible, hence any supergoal having ''not(X = H)'' on a branch will also fail.
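
A practical consequence of the behaviour above is that the position of a negated goal inside a clause matters. The sketch below is only an illustration: the predicate names ''naf'', ''likes'', ''first'' and ''second'' are hypothetical and do not appear in the lecture. ''naf/1'' repeats the definition of ''not'' given earlier, under a different name so that it does not clash with a built-in.
<code prolog>
% naf/1: hypothetical name, same definition as not/1 in the text.
naf(G) :- call(G), !, fail.
naf(_).

% Toy facts, assumed only for this illustration.
likes(mary, wine).
likes(john, beer).

% X is bound by likes/2 before the negated goal is reached.
first(X) :- likes(X, _), naf(X = mary).

% X is still unbound when the negated goal is reached:
% X = mary succeeds, hence naf(X = mary) fails.
second(X) :- naf(X = mary), likes(X, _).
</code>

Under these assumptions, ''?- first(X).'' is satisfied under $math[\{john/X\}], while ''?- second(X).'' fails, for the same reason that ''?- H = 1, not(X = H).'' fails above.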