Lab 7. Lambda Calculus

7.0. What? Why?

Lambda Calculus is a universal model of computation (it can simulate any Turing Machine) based on function abstraction and application. Its very simple semantics makes it a convenient tool for studying properties of computation.

The first thing to take note of is that EVERYTHING is a function (an algorithm, the input and the output are all functions).

Let's start with a very simple example in Scala.

def apply(f: Int => Int, x: Int): Int = f(x)

To make the function look more like Lambda Calculus, we first drop the type annotations: since everything is a function, expressions are untyped.

def apply(f, x) = f(x)

Lambda Calculus also treats all functions as anonymous (it doesn't give them explicit names).

(f, x) => f(x)

In addition, all functions are curried (each takes exactly 1 input and returns 1 output).

f => (x => f(x))

Now we can re-write the function using lambda calculus syntax (instead of Scala), and we will get a valid lambda expression.
$ \lambda f.\lambda x.(f \ x) $
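
The same progression can be checked in typed Scala. The sketch below is only an illustration: `CurriedApply`, `applyCurried` and `inc` are names made up for this example, and the `Int` types are kept because Scala, unlike Lambda Calculus, requires them.

```scala
object CurriedApply {
  // Typed counterpart of λf.λx.(f x): an anonymous function that takes f
  // and returns another anonymous function that takes x.
  val applyCurried: (Int => Int) => Int => Int = f => x => f(x)

  // A sample argument (hypothetical, used only for the demonstration).
  val inc: Int => Int = n => n + 1

  // Arguments are supplied one at a time, as in ((λf.λx.(f x) inc) 41).
  val result: Int = applyCurried(inc)(41)
}
```

Here `CurriedApply.result` is `inc(41)`, i.e. the body `f(x)` with both parameters replaced by their arguments.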

Formal Definition

Given a set of variables VARS, an expression under lambda calculus can be:

variable	$ x $	$ x \in VARS $
function	$ \lambda x.e $	$ x \in VARS $, $ e $ is a $ \lambda $-expression
application	$ (e_1 \ e_2) $	$ e_1, e_2 $ are $ \lambda $-expressions
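
One possible way to represent this definition in Scala is an algebraic data type with one case per row. This is a sketch, not part of the lab's required code: the names `Expr`, `Var`, `Lam`, `App` and `show` are assumptions made for illustration, and variables are represented as plain strings.

```scala
object LambdaSyntax {
  sealed trait Expr
  case class Var(name: String) extends Expr               // variable x
  case class Lam(param: String, body: Expr) extends Expr  // function λx.e
  case class App(fn: Expr, arg: Expr) extends Expr        // application (e1 e2)

  // Prints an expression in the notation used in this lab.
  def show(e: Expr): String = e match {
    case Var(x)    => x
    case Lam(x, b) => s"λ${x}.${show(b)}"
    case App(f, a) => s"(${show(f)} ${show(a)})"
  }

  // λf.λx.(f x), the expression obtained in section 7.0
  val applyExpr: Expr = Lam("f", Lam("x", App(Var("f"), Var("x"))))
}
```

Calling `LambdaSyntax.show(LambdaSyntax.applyExpr)` rebuilds the textual form of the expression from its tree.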

To evaluate $\lambda$-expressions, there are two types of reduction operations:

$\alpha$-conversion: given an expression $ \lambda x.e $, you can rename the parameter $ x $ and all of its occurrences in $ e $ to a variable $ y $ that does not already appear in $ e $ (used for avoiding name collisions).
$\beta$-reduction: given an expression $(\lambda x.body \ param)$, you can replace all occurrences of $ x $ in $ body $ with $ param $ (we will denote the result by $ body[x \ / \ param] $).

Be careful about the difference between $ E_1 = (\lambda x.e_1 \ e_2) $ and $ E_2 = e_1[x \ / \ e_2] $. The former denotes an expression made from the application of a function to an expression, while the latter is the expression obtained by applying $ \beta $-reduction to the former. We say $ E_1 $ is reducible to $ E_2 $: ($ E_1 \Rightarrow E_2 $).

We say two expressions are equivalent if one of them can be obtained from the other using $\alpha$-conversions and $\beta$-reductions.

If an expression cannot be reduced further using $ \beta $-reductions, we say the expression is in $ \beta $-normal form.
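
As a short worked example (the variables $ g $ and $ a $ are placeholders chosen only for illustration), applying the expression from section 7.0 to two arguments takes two $ \beta $-reductions:

$ ((\lambda f.\lambda x.(f \ x) \ g) \ a) \Rightarrow (\lambda x.(g \ x) \ a) \Rightarrow (g \ a) $

The first step computes $ \lambda x.(f \ x)[f \ / \ g] $ and the second computes $ (g \ x)[x \ / \ a] $. Nothing in $ (g \ a) $ can be reduced further, so it is the $ \beta $-normal form.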

7.1. Free and bound variables

Take the following Scala snippet as an example:

def add(x: Int) = x + y

We can say that the second occurrence of $ x $ is bound by the $ x $ that appears as the function parameter. When we call the function, that occurrence of $ x $ is replaced by the argument provided to $ add $. In contrast, $ y $ is a free variable.
This code might look weird: where does $ y $ come from? What does it do? Why would we use a variable that we don't instantiate (i.e. that is not bound to anything)? Well, the snippet actually comes from a broader context:

def add_all(x: List[Int], y: Int) = {
  def add(x: Int) = x + y
  x.map(add)
}

In this new snippet all variables are bound: the variable $ y $, which was free before, is now bound by the outer function. Only the free variable is affected, though: $ x $ is still bound by the inner function, and the $ x $ parameter of $ add\_all $ (which is a List[Int]) is 'invisible' inside $ add $.
Free variables matter because only the free occurrences of a variable in a sub-expression can be bound by an enclosing expression.

Translating to lambda calculus, when reducing $ (\lambda x.e_1 \ e_2) $ to $ e_1[x \ / \ e_2] $, only free occurrences of $ x $ in $ e_1 $ will be replaced by $ e_2 $.

More generally, we say that:

if all occurrences of a variable in an expression are bound, the variable is said to be bound in that expression
if at least one occurrence of a variable in an expression is free, the variable is said to be free in that expression
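
The two rules above can be sketched as a recursive function that collects the variables with at least one free occurrence. This is an illustrative sketch: the `Expr` data type and the name `freeVars` are assumptions made for this example, not something defined by the lab.

```scala
object FreeVariables {
  sealed trait Expr
  case class Var(name: String) extends Expr
  case class Lam(param: String, body: Expr) extends Expr
  case class App(fn: Expr, arg: Expr) extends Expr

  // Set of variables that have at least one free occurrence in e.
  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)                      // a lone variable is free
    case Lam(x, b) => freeVars(b) - x             // λx binds the free x's of its body
    case App(f, a) => freeVars(f) ++ freeVars(a)  // union of both sides
  }

  // λx.(x y): x is bound, y is free
  val example: Expr = Lam("x", App(Var("x"), Var("y")))
}
```

A variable is bound in an expression exactly when it occurs in it but is missing from the set returned by `freeVars`.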

Exercise

7.1.1. For every variable occurrence, state whether it is a free or a bound occurrence:

$ \lambda y.(\lambda x.x \ (x \ y)) $
$ (\lambda x.(x \ \lambda y.((x \ y) \ z)) \ (x \ \lambda y.x)) $
$ (\lambda f.(\lambda x.f \ (x \ x)) \ (\lambda x.f \ (x \ x))) $

Solutions 7.1.1

Solutions:

1. $ \lambda y.(\lambda x.x_1 \ (x_2 \ y_1)) $
Bound occurrences: $ x_1 $ (to $ \lambda x $), $ y_1 $ (to $ \lambda y $).
Free occurrences: $ x_2 $ (not bound by $ \lambda x.x_1 $).
In the sub-context of $ \lambda x.x_1 \ (x_2 \ y_1) $, $ y_1 $ is free.

2. $ (\lambda x.(x_1 \ \lambda y_1.((x_2 \ y_2) \ z_1)) \ (x_3 \ \lambda y_3.x_4)) $
Bound occurrences: $ x_1, x_2 $ (to $ \lambda x $), $ y_2 $ (to $ \lambda y_1 $).
Free occurrences: $ z_1, x_3, x_4 $.
In the sub-context of $ \lambda y_1.((x_2 \ y_2) \ z_1) $, $ x_2 $ is free.

3. $ (\lambda f.(\lambda x_1.f_1 \ (x_2 \ x_3)) \ (\lambda x_4.f_2 \ (x_5 \ x_6))) $
Bound occurrences: $ f_1 $ (to $ \lambda f $).
Free occurrences: $ x_2, x_3, f_2, x_5, x_6 $.

7.2. Reduction rules

Using what we learned about free and bound variables, we can define an algorithm that computes the substitution $ e_1[x \ / \ e_2] $ performed by a $\beta$-reduction:

$ e_1 $	$ e_1[x \ / \ e_2] $	condition
$ x $	$ e_2 $
$ y $	$ y $	$ x \neq y $
$ E_1 \ E_2 $	$ E_1[x \ / \ e_2] \ E_2[x \ / \ e_2] $
$ \lambda x.e $	$ \lambda x.e $
$ \lambda y.e $	$ \lambda y.e[x \ / \ e_2] $	$ x \neq y $, $ y $ does not appear free in $ e_2 $
$ \lambda y.e $	$ \{\lambda z.e[y \ / \ z]\}[x \ / \ e_2] $	$ x \neq y $, $ y $ appears free in $ e_2 $	( $ z $ is a new variable that is not free in $ e $ or $ e_2 $ )
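
The table can be transcribed almost row by row into Scala. The following is a sketch under stated assumptions: `Expr`, `freeVars`, `fresh` and `subst` are names invented for this example, and new variables are generated by appending `'` to the old name. The sketch also keeps the new variable $ z $ different from $ x $, a small extra precaution that is needed because it substitutes directly in the renamed body.

```scala
object Substitution {
  sealed trait Expr
  case class Var(name: String) extends Expr
  case class Lam(param: String, body: Expr) extends Expr
  case class App(fn: Expr, arg: Expr) extends Expr

  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)
    case Lam(x, b) => freeVars(b) - x
    case App(f, a) => freeVars(f) ++ freeVars(a)
  }

  // First name among base, base', base'', ... that is not in `avoid`.
  def fresh(base: String, avoid: Set[String]): String =
    Iterator.iterate(base)(_ + "'").find(n => !avoid(n)).get

  // subst(e1, x, e2) computes e1[x / e2], one case per row of the table.
  def subst(e1: Expr, x: String, e2: Expr): Expr = e1 match {
    case Var(y) if y == x              => e2
    case Var(_)                        => e1
    case App(f, a)                     => App(subst(f, x, e2), subst(a, x, e2))
    case Lam(y, _) if y == x           => e1 // x is rebound here: nothing to replace
    case Lam(y, b) if !freeVars(e2)(y) => Lam(y, subst(b, x, e2))
    case Lam(y, b) =>
      // y is free in e2 and would be captured: rename y to a new z first
      val z = fresh(y, freeVars(b) ++ freeVars(e2) + x)
      Lam(z, subst(subst(b, y, Var(z)), x, e2))
  }
}
```

For instance, `subst(Lam("y", Var("x")), "x", Var("y"))` falls in the last row: the parameter is renamed before the substitution, so the free `y` being substituted in is not captured by `λy`.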

Evaluation order

Q: If we have multiple redexes in an expression, which one do we evaluate?

A: We can evaluate any of them. The Church-Rosser theorem guarantees that if an expression has a $ \beta $-normal form, that form is unique (up to $\alpha$-conversion), so every reduction sequence that terminates ends in the same result. Some reduction sequences may never terminate, though, which is why the choice of strategy matters.

Instead of choosing redexes at random, we follow a reduction strategy. We will use two of them, Normal Order and Applicative Order:

Normal Order evaluation consists of always reducing the leftmost, outermost redex (whenever possible, substitute the arguments into the function body)
Applicative Order evaluation consists of always reducing the leftmost, innermost redex (always reduce the function argument before the function itself)

An expression of the form $(\lambda x.e_1 \ e_2)$ is also called a redex (reducible expression)
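
The two strategies differ only in which redex they pick, which the following Scala sketch tries to make concrete. It is an illustration under assumptions, not a reference implementation: all names (`Expr`, `subst`, `stepNormal`, `stepApplicative`, `normalize`) are invented here, both strategies also reduce inside function bodies, and `normalize` gives up after a fixed number of steps because some expressions never reach a normal form.

```scala
object Strategies {
  sealed trait Expr
  case class Var(name: String) extends Expr
  case class Lam(param: String, body: Expr) extends Expr
  case class App(fn: Expr, arg: Expr) extends Expr

  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)
    case Lam(x, b) => freeVars(b) - x
    case App(f, a) => freeVars(f) ++ freeVars(a)
  }

  def fresh(base: String, avoid: Set[String]): String =
    Iterator.iterate(base)(_ + "'").find(n => !avoid(n)).get

  // Capture-avoiding substitution e1[x / e2].
  def subst(e1: Expr, x: String, e2: Expr): Expr = e1 match {
    case Var(y) if y == x              => e2
    case Var(_)                        => e1
    case App(f, a)                     => App(subst(f, x, e2), subst(a, x, e2))
    case Lam(y, _) if y == x           => e1
    case Lam(y, b) if !freeVars(e2)(y) => Lam(y, subst(b, x, e2))
    case Lam(y, b) =>
      val z = fresh(y, freeVars(b) ++ freeVars(e2) + x)
      Lam(z, subst(subst(b, y, Var(z)), x, e2))
  }

  // One Normal Order step: reduce the leftmost, outermost redex, if any.
  def stepNormal(e: Expr): Option[Expr] = e match {
    case App(Lam(x, b), a) => Some(subst(b, x, a))
    case App(f, a) =>
      stepNormal(f).map(App(_, a)).orElse(stepNormal(a).map(App(f, _)))
    case Lam(x, b) => stepNormal(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // One Applicative Order step: reduce the leftmost, innermost redex, if any.
  def stepApplicative(e: Expr): Option[Expr] = e match {
    case App(f, a) =>
      stepApplicative(f).map(App(_, a))
        .orElse(stepApplicative(a).map(App(f, _)))
        .orElse(f match {
          case Lam(x, b) => Some(subst(b, x, a)) // both sides are already normal
          case _         => None
        })
    case Lam(x, b) => stepApplicative(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // Applies `step` until no redex is left; None if `fuel` steps were not enough.
  def normalize(step: Expr => Option[Expr], e: Expr, fuel: Int = 1000): Option[Expr] =
    if (fuel == 0) None
    else step(e) match {
      case Some(next) => normalize(step, next, fuel - 1)
      case None       => Some(e)
    }

  // (λx.x (λy.y a)): the outer redex and the redex in the argument
  val example: Expr = App(Lam("x", Var("x")), App(Lam("y", Var("y")), Var("a")))
}
```

On `example`, `stepNormal` reduces the outer application first and yields $ (\lambda y.y \ a) $, while `stepApplicative` reduces the argument first and yields $ (\lambda x.x \ a) $. Both continue to the same normal form $ a $.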

Exercise

7.2.1. Evaluate in both Normal Order and Applicative Order the following expressions:

$ (\lambda x.\lambda y.\lambda z.((x \ z) \ y) \ \lambda x.\lambda y.x) $
$ ((\lambda x.\lambda y.((x \ y) \ x) \ \lambda x.\lambda y.x) \ (\lambda x.\lambda y.\lambda z.((x \ z) \ y) \ \lambda x.\lambda y.y))$
$ (\lambda x.y \ (\lambda x.(x \ x) \ \lambda x.(x \ x)))$

Solutions 7.2.1

Solutions:

Lambda calculus as a programming language

The Church-Turing thesis asserts that any computable function can be computed using lambda calculus (or Turing Machines or equivalent models).

How can this be? Everything in Lambda Calculus is a function, so there are no numbers to compute with. While we do not have the numbers we are used to, we can define higher-order functions that act as analogs of the concepts we are familiar with and use them instead.

The representations presented below are also called Church encodings, because they were first used by Alonzo Church, the inventor of Lambda Calculus.

7.3. Booleans

We can encode boolean values TRUE and FALSE in lambda calculus as functions that take 2 values, x and y, and return the first (for TRUE) or second (for FALSE) value.

$ TRUE = \lambda x.\lambda y.x$
$ FALSE = \lambda x.\lambda y.y$

As we defined it, TRUE is sometimes called the K-Combinator (or Kestrel), and FALSE the KI-Combinator (or Kite).

Some common operations on booleans (that were discussed during the lecture) are:

$ AND = \lambda x.\lambda y.((x \ y) \ x) $
$ OR = \lambda x.\lambda y.((x \ x) \ y) $
$ NOT = \lambda x.((x \ FALSE) \ TRUE) $
$ IF = \lambda c.\lambda t.\lambda e.((c \ t) \ e) $

NOT can also be written as:

$ NOT = \lambda x.\lambda a.\lambda b.((x \ b) \ a) $

You can convince yourself that this works by evaluating $ NOT \ TRUE $ and $ NOT \ FALSE $. This way of writing NOT is also called the C-Combinator (or Cardinal).
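
The definitions above can also be tried out in Scala. The sketch below is an approximation under assumptions: untyped terms are modelled by a single-method trait `L` (every value is a function from terms to terms), `Atom` values are inert markers used only to see which argument a boolean selects, and `L`, `Atom`, `NOT2` and `toBoolean` are helper names invented for this example. Note that Scala evaluates arguments eagerly, unlike Normal Order.

```scala
object ChurchBooleans {
  // Every term is a function from terms to terms.
  trait L { def apply(x: L): L }

  // Inert marker values, used only to observe which argument was selected.
  case class Atom(name: String) extends L {
    def apply(x: L): L = sys.error(s"atom $name cannot be applied")
  }

  val TRUE: L  = x => y => x
  val FALSE: L = x => y => y

  val AND: L = x => y => x(y)(x)
  val OR: L  = x => y => x(x)(y)
  val NOT: L = x => x(FALSE)(TRUE)
  val IF: L  = c => t => e => c(t)(e)

  // The alternative NOT (C-Combinator): it swaps the two arguments given to x.
  val NOT2: L = x => a => b => x(b)(a)

  // Decodes a Church boolean by asking it to choose between two atoms.
  def toBoolean(b: L): Boolean = b(Atom("t"))(Atom("f")) == Atom("t")
}
```

For example, `ChurchBooleans.toBoolean(ChurchBooleans.AND(ChurchBooleans.TRUE)(ChurchBooleans.FALSE))` decodes the result of $ ((AND \ TRUE) \ FALSE) $ into an ordinary Scala `Boolean`.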

Exercises
7.3.1. Define the $ XOR $ operation over booleans.

7.3.2. Define the $ NAND $ operation over booleans.

7.3.3. Define the $ NOR $ operation over booleans.

Solutions 7.3

Solutions:

https://aartaka.me/lambda-3.html

7.3.1:

V1: $ XOR = \lambda a.\lambda b.((a \ (NOT \ b)) \ b) $

V2: $ XOR = \lambda a.\lambda b.((OR \ ((AND \ (NOT \ a)) \ b)) \ ((AND \ (NOT \ b)) \ a)) $

7.3.2:

V1: $ NAND = \lambda a.\lambda b.((a \ ((b \ FALSE) \ TRUE)) \ TRUE) $

V2: $ NAND = \lambda a.\lambda b.(NOT \ ((AND \ a) \ b)) $

7.3.3:

V1: $ NOR = \lambda a.\lambda b.((a \ FALSE) \ (NOT \ b)) $

V2: $ NOR = \lambda a.\lambda b.(NOT \ ((OR \ a) \ b)) $

Pairs - Lecture Reminder

We can also encode data structures. We will only look at one of the simpler ones, the pair.
A pair packs two values together, which we can later access using $ FIRST $ and $ SECOND $.

$ PAIR = \lambda a.\lambda b.\lambda z.((z \ a) \ b) $
$ FIRST = \lambda p.(p \ TRUE) $
$ SECOND = \lambda p.(p \ FALSE) $

The $ PAIR $ higher-order function we defined is also called the V-Combinator (or Vireo).
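
A minimal Scala sketch of the pair encoding follows. It makes the same assumptions as any eager, typed approximation of untyped terms: the trait `L` and the inert `Atom` markers are helper names invented for this example and are not part of the encoding itself.

```scala
object ChurchPairs {
  // Every term is a function from terms to terms.
  trait L { def apply(x: L): L }

  // Inert marker values standing in for the data stored in the pair.
  case class Atom(name: String) extends L {
    def apply(x: L): L = sys.error(s"atom $name cannot be applied")
  }

  val TRUE: L  = x => y => x
  val FALSE: L = x => y => y

  // PAIR stores a and b inside a function that waits for a selector z.
  val PAIR: L   = a => b => z => z(a)(b)
  val FIRST: L  = p => p(TRUE)   // TRUE selects the first component
  val SECOND: L = p => p(FALSE)  // FALSE selects the second component

  val p: L = PAIR(Atom("left"))(Atom("right"))
}
```

`FIRST(p)` passes `TRUE` to the pair as the selector, so the pair hands back its first stored value; `SECOND(p)` does the same with `FALSE`.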

7.4. Natural Numbers - Church numerals

Church numerals represent natural numbers as higher-order functions. Under this representation, the number n is a function that maps f to its n-fold composition.

$ N0 = \lambda f.\lambda x. x $
$ N1 = \lambda f.\lambda x. (f \ x) $
$ N2 = \lambda f.\lambda x. (f \ (f \ x)) $
…

Does N0 look familiar? It's the same as FALSE if you rename the variables (using $\alpha$-conversion).

You can also define operations on Church numerals; some (that were discussed during the lecture) are:

$ SUCC = \lambda n.\lambda f.\lambda x.(f \ ((n \ f) \ x)) $
$ ISZERO = \lambda n.((n \ \lambda x.FALSE) \ TRUE) $
$ ADD = \lambda n.\lambda m.\lambda f.\lambda x.((n \ f) \ ((m \ f) \ x)) $
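
To experiment with these definitions, a numeral can be decoded by applying it to an ordinary "+1" function and to 0. The Scala sketch below does that under a few assumptions: untyped terms are modelled by the single-method trait `L`, and `Num`, `plusOne`, `toInt` and `toBoolean` are helper names invented for this example that exist only to observe results.

```scala
object ChurchNumerals {
  // Every term is a function from terms to terms.
  trait L { def apply(x: L): L }

  // Wrapper for ordinary integers, used only for decoding.
  case class Num(n: Int) extends L {
    def apply(x: L): L = sys.error("a plain number cannot be applied")
  }

  val TRUE: L  = x => y => x
  val FALSE: L = x => y => y

  val N0: L = f => x => x
  val N1: L = f => x => f(x)
  val N2: L = f => x => f(f(x))

  val SUCC: L   = n => f => x => f(n(f)(x))
  val ISZERO: L = n => n(_ => FALSE)(TRUE)
  val ADD: L    = n => m => f => x => n(f)(m(f)(x))

  // Ordinary successor on wrapped integers.
  val plusOne: L = x => x match {
    case Num(k) => Num(k + 1)
    case other  => sys.error(s"expected a Num, got $other")
  }

  // A numeral n applies its first argument n times to its second one.
  def toInt(n: L): Int = n(plusOne)(Num(0)) match {
    case Num(k) => k
    case other  => sys.error(s"not a numeral: $other")
  }

  // A boolean selects between its two arguments.
  def toBoolean(b: L): Boolean = b(Num(1))(Num(0)) == Num(1)
}
```

With these helpers, `toInt(ADD(N2)(N1))` counts how many times the resulting numeral applies `plusOne`, and `toBoolean(ISZERO(N0))` shows which branch `ISZERO` selects.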

Exercises

7.4.1. Define multiplication under Church numerals: $ MULT = \lambda n.\lambda m. \ \ldots $ (Hint: you can do it without the Y-Combinator)

7.4.2. Define exponentiation under Church numerals: $ EXP = \lambda n.\lambda m. \ \ldots $

7.4.3. (*) Define the predecessor operator, which takes a number and returns the number prior to it. What's the predecessor of 0? Evaluate $ (PRED \ N0) $.

7.4.4. Define subtraction under Church numerals: $ SUB = \lambda n.\lambda m. \ \ldots $ (Hint: use $ PRED $). What happens if you try to subtract a bigger number from a smaller one? Evaluate $ (SUB \ N1 \ N2) $.

7.4.5. Define $ LEQ $ (less or equal). $ LEQ \ n \ m $ should return TRUE if $ n \leq m $ and FALSE if $ n > m $.

7.4.6. Define $ EQ $ (equality). $ EQ \ n \ m $ should return TRUE if $ n = m $ and FALSE otherwise.

Solution 7.4.

7.4.3.

Let's start with defining a shift-and-increment operator:
$ \phi' = \lambda x.(PAIR \ x \ (SUCC \ x)) $

This takes a number $ n $ and returns a pair made up of the number and its successor ( $ n $ , $ (SUCC \ n) $ ).

To be able to apply this function repeatedly to its own output, we make the input a pair as well, where the second value is the 'real' input:
$ \phi = \lambda p.((PAIR \ (SECOND \ p)) \ (SUCC \ (SECOND \ p))) $

This takes a pair ( $ n $, $ (SUCC \ n) $ ) and returns another pair ( $ (SUCC \ n) $, $ (SUCC \ (SUCC \ n)) $ ).
Now we can just iterate this n times starting with the pair ( $ N0 $, $ N0 $ ), and we get a pair ( $ n - 1 $, $ n $ ), where the first value is our predecessor:
$ PRED = \lambda n.(FIRST \ ((n \ \phi) \ (PAIR \ N0 \ N0))) $
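
The pair-based definition can be checked with a Scala sketch. As before, this is an approximation under assumptions: the trait `L` models untyped terms, and `Num`, `plusOne`, `toInt`, `PHI` (standing for $ \phi $) and `N3` are names introduced only for this example.

```scala
object ChurchPred {
  // Every term is a function from terms to terms.
  trait L { def apply(x: L): L }

  // Wrapper for ordinary integers, used only for decoding.
  case class Num(n: Int) extends L {
    def apply(x: L): L = sys.error("a plain number cannot be applied")
  }

  val TRUE: L   = x => y => x
  val FALSE: L  = x => y => y
  val PAIR: L   = a => b => z => z(a)(b)
  val FIRST: L  = p => p(TRUE)
  val SECOND: L = p => p(FALSE)

  val N0: L   = f => x => x
  val SUCC: L = n => f => x => f(n(f)(x))

  // φ maps a pair (a, b) to (b, SUCC b).
  val PHI: L  = p => PAIR(SECOND(p))(SUCC(SECOND(p)))
  // Apply φ n times to (N0, N0), then take the first component.
  val PRED: L = n => FIRST(n(PHI)(PAIR(N0)(N0)))

  val plusOne: L = x => x match {
    case Num(k) => Num(k + 1)
    case other  => sys.error(s"expected a Num, got $other")
  }

  def toInt(n: L): Int = n(plusOne)(Num(0)) match {
    case Num(k) => k
    case other  => sys.error(s"not a numeral: $other")
  }

  val N3: L = SUCC(SUCC(SUCC(N0)))
}
```

Evaluating `toInt(PRED(N3))` follows the pairs ( 0, 0 ), ( 0, 1 ), ( 1, 2 ), ( 2, 3 ) and returns the first component of the last one.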

An alternative solution that uses a value container is the following (we will not explain it in further detail here):
$ PRED = \lambda n.\lambda f.\lambda x.(((n \ (\lambda g.\lambda h.(h \ (g \ f)))) \ \lambda u.x) \ \lambda v.v) $

7.5. Recursion and the Sage Bird

In lambda calculus, recursion is achieved using the fixed-point combinator (or Y combinator, “Why” bird or Sage bird). A fixed-point combinator is a higher-order function that returns a fixed point of its argument function ($ x $ is a fixed point of a function $ f $ if $ f(x) = x $). That means: $ f \ (fix \ f) = fix \ f $. And by repeated application: $ fix \ f = f \ (f \ (\ldots f \ (fix \ f)\ldots)) $

The Y-combinator in lambda calculus looks like this:

$ FIX = \lambda f.(\lambda x.(f \ (x \ x)) \ \lambda x.(f \ (x \ x))) $
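
The fixed-point equation can be illustrated in Scala, with an important caveat: the sketch below is not the Y-combinator itself. The self-application $ (x \ x) $ is not directly typable in Scala, and eager evaluation would loop on it, so `fix` here relies on Scala's own named recursion and only mirrors the property $ fix \ f = f \ (fix \ f) $. The names `fix`, `factStep` and `factorial` are assumptions made for this example.

```scala
object FixedPoint {
  // fix(f) behaves like f(fix(f)). The extra parameter x delays the recursive
  // call until the resulting function is actually applied, which eager
  // evaluation needs in order to terminate.
  def fix[A, B](f: (A => B) => (A => B)): A => B = x => f(fix(f))(x)

  // One non-recursive "step" of factorial: `self` stands for the recursive call.
  val factStep: (Int => Int) => (Int => Int) =
    self => n => if (n == 0) 1 else n * self(n - 1)

  // The recursive function is obtained as a fixed point of factStep.
  val factorial: Int => Int = fix(factStep)
}
```

`factStep` never refers to itself by name; the recursion comes entirely from `fix`, which is the role the Y-combinator plays in lambda calculus.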