Lab 7. Lambda Calculus

Lambda Calculus is a formal system for expressing computation based on function abstraction and application. Because it has very simple semantics, it is used for formally studying properties of computations.

In the context of Lambda Calculus EVERYTHING is a function (an algorithm, its input and its output are all functions).

Let's start with a very simple example in Scala.

def apply(f: Int => Int, x: Int): Int = f(x)

The first adjustment needed to make the function look more like Lambda Calculus is to drop the types: since everything is a function, everything is untyped (the snippet below is therefore no longer valid Scala).

def apply(f, x) = f(x)

A first abstraction Lambda Calculus makes is that it treats all functions as anonymous (it doesn't give them explicit names).

(f, x) => f(x)

Another abstraction is that all functions are curried (only take 1 input and return 1 output).

f => (x => f(x))

Finally, we rewrite the function using Lambda Calculus syntax instead of Scala: $$ \lambda f.\lambda x.(f \ x) $$
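
To make the currying step concrete, here is a minimal typed sketch in valid Scala. The names `CurryingDemo`, `applyUncurried`, `applyCurried`, `inc` and `applyInc` are chosen for illustration only, and the `Int` types are kept just so the code compiles.

```scala
object CurryingDemo {
  // Uncurried: one function taking both arguments at once.
  val applyUncurried: (Int => Int, Int) => Int = (f, x) => f(x)

  // Curried: a function of f that returns a function of x.
  val applyCurried: (Int => Int) => Int => Int = f => x => f(x)

  // A sample argument function.
  val inc: Int => Int = _ + 1

  // Supplying only the first argument yields a new one-argument function.
  val applyInc: Int => Int = applyCurried(inc)
}
```

Both forms compute the same result; the curried one additionally allows partial application, as `applyInc` shows. Scala's standard `curried` method on two-argument functions performs the same conversion.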

Definition

Given a set of variables VARS, an expression in lambda calculus can be:

variable	$ x $	$ x \in VARS $
function	$ \lambda x.e $	$ x \in VARS $, $ e $ is a $ \lambda $-expression
application	$ (e_1 \ e_2) $	$ e_1, e_2 $ are $ \lambda $-expressions
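
The three cases of the definition map naturally onto an algebraic data type. The sketch below is one possible encoding, not a prescribed one: the names `LambdaSyntax`, `Expr`, `Var`, `Lam`, `App` and `show` are assumptions made for illustration, and variables are represented as plain strings.

```scala
object LambdaSyntax {
  sealed trait Expr
  final case class Var(name: String) extends Expr              // x
  final case class Lam(param: String, body: Expr) extends Expr // λx.e
  final case class App(fn: Expr, arg: Expr) extends Expr       // (e1 e2)

  // Render an expression in the notation used in this lab.
  def show(e: Expr): String = e match {
    case Var(x)    => x
    case Lam(x, b) => s"λ${x}.${show(b)}"
    case App(f, a) => s"(${show(f)} ${show(a)})"
  }

  // λf.λx.(f x), the `apply` function from the introduction.
  val applyExpr: Expr = Lam("f", Lam("x", App(Var("f"), Var("x"))))
}
```

Calling `LambdaSyntax.show(LambdaSyntax.applyExpr)` returns the string `λf.λx.(f x)`.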

To evaluate $\lambda$-expressions, there are two types of reduction operations:

$\alpha$-conversion: given an expression $ \lambda x.e $, you can rename the parameter $ x $ and all of its free occurrences in $ e $ to a new variable $ y $ that does not already appear in $ e $ (used for avoiding name collisions).
$\beta$-reduction: given an expression $(\lambda x.body \ param)$, you can replace all free occurrences of $ x $ in $ body $ with $ param $ (we will denote the result by $ body[x \ / \ param] $).
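
A small illustration of why the new name in an $\alpha$-conversion has to be chosen with care. The symbol $ \equiv_\alpha $ ("equal up to renaming") is notation introduced here for the example only. Renaming $ x $ to an unused variable $ z $ preserves the meaning, while renaming it to $ y $ would capture the free $ y $ and change the expression:

$$ \lambda x.(x \ y) \equiv_\alpha \lambda z.(z \ y) \qquad \lambda x.(x \ y) \not\equiv_\alpha \lambda y.(y \ y) $$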

Be careful about the difference between $ E_1 = (\lambda x.e_1 \ e_2) $ and $ E_2 = e_1[x \ / \ e_2] $. The former denotes an expression made from the application of a function to an expression, while the latter is the expression obtained by applying $ \beta $-reduction to the former. We say $ E_1 $ is reducible to $ E_2 $ ($ E_1 \Rightarrow E_2 $).
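
As a worked instance (the concrete expressions are chosen here for illustration), take $ e_1 = (x \ x) $ and $ e_2 = y $. The application on the left is $ E_1 $, and the result of the substitution on the right is $ E_2 $:

$$ E_1 = (\lambda x.(x \ x) \ y) \Rightarrow (x \ x)[x \ / \ y] = (y \ y) = E_2 $$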

We say two expressions are equivalent if it is possible to get one of them from the other using $\alpha$-conversion and $\beta$-reductions.

If an expression cannot be reduced further using $ \beta $-reductions, we say the expression is in $ \beta $-normal form.

Free and bound variables

Take the following Scala snippet as an example:

def f(x: Int) = x + y

We say that the second occurrence of $ x $ is bound by the $ x $ that appears as a function parameter. When we call the function, that occurrence of $ x $ is replaced by the argument that was provided to $ f $. In contrast, $ y $ is a free variable.
This code might look weird. Where does $ y $ come from? What does it do? Why would we use a variable that we don't instantiate (i.e. that is not bound to anything)? Well, the snippet actually comes from a broader context:

def g(x: Int, y: Int) = {
  def f(x: Int) = x + y
  f(x * y)
}

In this new snippet all variables are bound. The variable $ y $, which was free in $ f $ taken on its own, is now bound by the parameter of the outer function $ g $. Only that free variable is affected: $ x $ is still bound by the parameter of the inner function, and the $ x $ parameter of $ g $ is not visible inside the body of $ f $.
Free variables matter because only the free occurrences of a variable in a sub-expression can be bound by an enclosing expression.

Translating to lambda calculus, when reducing $ (\lambda x.e_1 \ e_2) $ to $ e_1[x \ / \ e_2] $, only free occurrences of $ x $ in $ e_1 $ will be replaced by $ e_2 $.

More generally, we say that:

if all occurrences of a variable in an expression are bound, the variable is said to be bound
if at least one occurrence of a variable in an expression is free, the variable is said to be free
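
The two rules above can be turned into a short recursive function that collects the free variables of an expression. This is a sketch under assumptions: the encoding (`Expr`, `Var`, `Lam`, `App`) and the names `FreeVars`, `freeVars`, `inner`, `outer` and `mixed` are illustrative, and application stands in for the `+` of the Scala snippet.

```scala
object FreeVars {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  // Variables that have at least one free occurrence in e.
  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)                     // a lone variable is free
    case Lam(x, b) => freeVars(b) - x            // λx binds x inside its body
    case App(f, a) => freeVars(f) ++ freeVars(a) // union of both sides
  }

  // λx.(x y): x is bound, y is free (mirrors `def f(x: Int) = x + y`).
  val inner: Expr = Lam("x", App(Var("x"), Var("y")))

  // λy.λx.(x y): the enclosing λy binds the previously free y.
  val outer: Expr = Lam("y", inner)

  // (x λx.x): the first x is free, the second is bound, so x counts as free.
  val mixed: Expr = App(Var("x"), Lam("x", Var("x")))
}
```

Here `freeVars(inner)` contains only `y`, `freeVars(outer)` is empty, and `freeVars(mixed)` contains `x` because one free occurrence is enough.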

Exercise

7.1.1. For every variable occurrence, state whether it is a free or a bound occurrence:

$ \lambda y.(\lambda x.x \ (x \ y)) $
$ \lambda x.(x \ \lambda y.(x \ y \ z)) \ (x \ \lambda y.x) $
$ \lambda f.((\lambda x.(f \ (x \ x))) \ (\lambda x.(f \ (x \ x)))) $

Reduction rules

Using what we learned about free and bound variables, we can define an algorithm for computing the substitution $ e_1[x \ / \ e_2] $ used by $\beta$-reduction, by cases on the shape of $ e_1 $:

$ e_1 $	$ e_1[x \ / \ e_2] $	condition
$ x $	$ e_2 $
$ y $	$ y $	$ x \neq y $
$ E_1 \ E_2 $	$ E_1[x \ / \ e_2] \ E_2[x \ / \ e_2] $
$ \lambda x.e $	$ \lambda x.e $
$ \lambda y.e $	$ \lambda y.e[x \ / \ e_2] $	$ x \neq y $, $ y $ does not appear free in $ e_2 $
$ \lambda y.e $	$ \{\lambda z.e[y \ / \ z]\}[x \ / \ e_2] $	$ x \neq y $, $ y $ appears free in $ e_2 $	( $ z $ is a new variable that is not free in $ e $ or $ e_2 $ )
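
The table can be transcribed almost row by row into a capture-avoiding substitution function. The following is a sketch, not a reference implementation: the encoding and the names `Substitution`, `subst`, `freeVars` and `fresh` are assumptions, and generating new variables by appending `'` is just one possible naming scheme.

```scala
object Substitution {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)
    case Lam(x, b) => freeVars(b) - x
    case App(f, a) => freeVars(f) ++ freeVars(a)
  }

  // Hypothetical helper: first of base', base'', ... that is not in `avoid`.
  def fresh(base: String, avoid: Set[String]): String =
    Iterator.iterate(base + "'")(_ + "'").find(n => !avoid(n)).get

  // subst(e1, x, e2) computes e1[x / e2]; one case per row of the table.
  def subst(e1: Expr, x: String, e2: Expr): Expr = e1 match {
    case Var(y) if y == x    => e2                              // row 1
    case Var(_)              => e1                              // row 2
    case App(a, b)           => App(subst(a, x, e2), subst(b, x, e2)) // row 3
    case Lam(y, _) if y == x => e1                              // row 4: x is shadowed
    case Lam(y, body) if !freeVars(e2)(y) =>
      Lam(y, subst(body, x, e2))                                // row 5: no capture possible
    case Lam(y, body) =>
      // row 6: rename y to a new z (not free in body or e2), then retry
      val z = fresh(y, freeVars(body) ++ freeVars(e2))
      subst(Lam(z, subst(body, y, Var(z))), x, e2)
  }
}
```

For example, `subst(Lam("y", App(Var("x"), Var("y"))), "x", Var("y"))` renames the bound `y` before substituting and yields `Lam("y'", App(Var("y"), Var("y'")))`, so the free `y` of the argument is not captured.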

Evaluation order

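Q: If we have multiple redexes in an expression, which one do we evaluate?

A: We can evaluate any of them. The Church-Rosser theorem guarantees that if an expression has a $ \beta $-normal form, that form is unique (up to $ \alpha $-conversion), so every reduction sequence that terminates ends in the same result. A poorly chosen order may never terminate, though.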

Instead of choosing redexes at random, we follow a reduction strategy. We will use two of them, Normal Order and Applicative Order:

Normal Order evaluation consists of always reducing the leftmost, outermost redex (whenever possible, substitute the arguments into the function body before reducing them)
Applicative Order evaluation consists of always reducing the leftmost, innermost redex (always reduce the function argument before the function itself)

An expression of the form $ (\lambda x.e_1 \ e_2) $ is also called a redex (reducible expression).
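
The two strategies can be compared with a small single-step evaluator. This is a sketch under stated assumptions: all names (`Strategies`, `stepNormal`, `stepApplicative`, `run`, `example`) are illustrative, the substitution is deliberately simplified and assumes no variable capture can occur (which holds for the example used), and `fuel` is an arbitrary bound that stops expressions that never reach a normal form.

```scala
object Strategies {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  // Simplified substitution: ASSUMES no variable capture can occur,
  // so it never renames bound variables.
  def subst(e: Expr, x: String, arg: Expr): Expr = e match {
    case Var(y)    => if (y == x) arg else e
    case App(f, a) => App(subst(f, x, arg), subst(a, x, arg))
    case Lam(y, b) => if (y == x) e else Lam(y, subst(b, x, arg))
  }

  // One Normal Order step: leftmost, outermost redex first.
  def stepNormal(e: Expr): Option[Expr] = e match {
    case App(Lam(x, b), a) => Some(subst(b, x, a))
    case App(f, a) =>
      stepNormal(f).map(App(_, a)).orElse(stepNormal(a).map(App(f, _)))
    case Lam(x, b) => stepNormal(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // One Applicative Order step: leftmost, innermost redex first.
  def stepApplicative(e: Expr): Option[Expr] = e match {
    case App(f, a) =>
      stepApplicative(f).map(App(_, a))
        .orElse(stepApplicative(a).map(App(f, _)))
        .orElse(f match {
          case Lam(x, b) => Some(subst(b, x, a)) // only once f and a are normal
          case _         => None
        })
    case Lam(x, b) => stepApplicative(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // Apply `step` until no redex is left or `fuel` steps were taken.
  // Returns the final expression and the number of steps performed.
  def run(step: Expr => Option[Expr], e: Expr,
          fuel: Int = 100, steps: Int = 0): (Expr, Int) =
    step(e) match {
      case Some(next) if steps < fuel => run(step, next, fuel, steps + 1)
      case _                          => (e, steps)
    }

  // (λx.(x x) (λy.y z)): the argument is itself a redex.
  val example: Expr =
    App(Lam("x", App(Var("x"), Var("x"))), App(Lam("y", Var("y")), Var("z")))

  val normalResult: (Expr, Int)      = run(stepNormal, example)
  val applicativeResult: (Expr, Int) = run(stepApplicative, example)
}
```

Both strategies end in the same normal form $ (z \ z) $, but Normal Order needs 3 steps because it copies the unreduced argument and reduces it twice, while Applicative Order reduces the argument once and finishes in 2 steps.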

Exercise

7.1.2. Evaluate in both Normal Order and Applicative Order the following expressions:

$ \lambda x.\lambda y.(x \ y \ x) \ \lambda x.\lambda y.x \ (\lambda x.\lambda y.\lambda z.(x \ z \ y) \ \lambda x.\lambda y.y)$
$ \lambda x.y \ (\lambda x.(x \ x) \ \lambda x.(x \ x))$

Lambda calculus as a programming language (optional)

The Church-Turing thesis asserts that any computable function can be computed using lambda calculus (or Turing Machines or equivalent models).
For the curious, a series of additional exercises covering this topic can be found here: Lambda Calculus as a programming language.