====== Lab 7. Lambda Calculus ======
===== 7.0. What? Why? =====
**Lambda Calculus** is a universal model of computation (it can simulate any Turing Machine) based on function //abstraction// and //application//. It has very simple semantics, which makes it well suited to studying properties of computation. \\
The first thing to take note of is that **EVERYTHING** is a function (an algorithm, the input and the output are all functions). \\
Let's start with a very simple example in Scala.
def apply(f: Int => Int, x: Int): Int = f(x)
The first adjustment that makes the function look more like Lambda Calculus: since everything is a function, everything is untyped.
def apply(f, x) = f(x)
Lambda Calculus also treats all functions as //anonymous// (it doesn't give them explicit names).
(f, x) => f(x)
Finally, all functions are //curried// (each takes exactly one input and returns exactly one output).
f => (x => f(x))
Now we can re-write the function using lambda calculus syntax (instead of Scala), and we will get a valid lambda expression. \\
$ \lambda f.\lambda x.(f \ x) $
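For comparison, the curried form can also be written as typed Scala that compiles. This is only a minimal sketch: the names ''CurriedApply'', ''applyCurried'' and ''inc'' are made up for the illustration and are not part of the lab.

```scala
object CurriedApply {
  // Curried version of apply: takes f and returns a function that awaits x.
  val applyCurried: (Int => Int) => Int => Int = f => x => f(x)

  // Hypothetical helper, used only for the demonstration.
  val inc: Int => Int = n => n + 1

  // Arguments are supplied one at a time, as in ((λf.λx.(f x) inc) 41).
  val result: Int = applyCurried(inc)(41)
}
```

Supplying only the first argument, ''applyCurried(inc)'', yields a new function, which is exactly what currying means.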
==== Formal Definition ====
Given a set of variables **VARS**, an expression under lambda calculus can be: \\
| // variable // | $ x $ | $ x \in VARS $ |
| // function // | $ \lambda x.e $ | $ x \in VARS $, $ e $ is a $ \lambda $-expression |
| // application // | $ (e_1 \ e_2) $ | $ e_1, e_2 $ are $ \lambda $-expressions |
To evaluate $\lambda$-expressions, there are two types of reduction operations:
* **$\alpha$-conversion**: given an expression $ \lambda x.e $, you can rename the parameter //**x**// and all its free occurrences in //**e**// to a new variable //**y**// that does not already occur in //**e**// (used for avoiding **name collisions**).
* **$\beta$-reduction**: given an expression $(\lambda x.body \ param)$, you can replace all free occurrences of //**x**// in //**body**// with //**param**// (we will denote this action with $ body[x \ / \ param] $).
Be careful about the difference between $ E_1 = (\lambda x.e_1 \ e_2) $ **and** $ E_2 = e_1[x \ / \ e_2] $. The former denotes an expression made from an // application // of a // function // to an // expression //, while the latter is the // expression // obtained by applying $ \beta $**-reduction** to the former. We say $ E_1 $ is reducible to $ E_2 $: ($ E_1 \Rightarrow E_2 $).
We say two // expressions // are equivalent if both can be transformed into a common expression using **$\alpha$-conversions** and **$\beta$-reductions**.
If an // expression // cannot be reduced further using $ \beta $**-reductions**, we say the expression is in $ \beta $**-normal form**.
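The three syntactic forms from the table map naturally onto a small algebraic data type. The following is a sketch under our own naming (''Expr'', ''Var'', ''Lam'', ''App'' and ''show'' are assumptions for illustration, not definitions from the lecture):

```scala
object LambdaSyntax {
  // One case class per row of the formal definition.
  sealed trait Expr
  final case class Var(name: String) extends Expr             // x
  final case class Lam(param: String, body: Expr) extends Expr // λx.e
  final case class App(fn: Expr, arg: Expr) extends Expr       // (e1 e2)

  // Pretty-printer using the same notation as the text.
  def show(e: Expr): String = e match {
    case Var(x)    => x
    case Lam(x, b) => s"λ$x.${show(b)}"
    case App(f, a) => s"(${show(f)} ${show(a)})"
  }

  // The expression λf.λx.(f x) built in the previous section.
  val applyExpr: Expr = Lam("f", Lam("x", App(Var("f"), Var("x"))))
}
```

With this encoding, a redex is simply a value of the shape ''App(Lam(x, body), arg)''.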
===== 7.1. Free and bound variables =====
Take the following Scala snippet as an example:
def add(x: Int) = x + y
We say that the second occurrence of $ x $ (the one in the function body) is // bound // by the $ x $ that appears as the function parameter. When we call the function, that occurrence of $ x $ is replaced by the argument provided to $ add $. In contrast, $ y $ is a // free // variable. \\
This code might look weird: where does $ y $ come from? What does it do? Why would we use a variable that is not bound to anything? Well, the snippet actually comes from a broader context:
def add_all(x: List[Int], y: Int) = {
  def add(x: Int) = x + y
  x.map(add)
}
In this new snippet all variables are // bound //: the variable $ y $, which was // free // before, is now // bound // by the parameter of the outer function. Only the // free // variable is affected, though: $ x $ is still bound by the inner function, and the $ x $ parameter of $ add\_all $ (which is a ''List[Int]'') is 'invisible' inside $ add $. \\
Free variables matter because only the // free occurrences // of a variable in a sub-expression can be bound by an enclosing expression. \\
\\
Translating to lambda calculus, when reducing $ (\lambda x.e_1 \ e_2) $ to $ e_1[x \ / \ e_2] $, only // free // occurrences of $ x $ in $ e_1 $ will be replaced by $ e_2 $.
More generally, we say that:
* if all occurrences of a variable in an expression are // bound //, the variable is said to be // bound // in that expression
* if at least one occurrence of a variable in an expression is // free //, the variable is said to be // free // in that expression
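The set of free variables of an expression can be computed by structural recursion. The sketch below assumes a small term representation of our own (''Var'', ''Lam'', ''App'') and an illustrative function name ''freeVars''. The sample term $ \lambda x.(x \ y) $ mirrors the ''add'' snippet above.

```scala
object FreeVariables {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  // A variable is free in an expression if at least one occurrence is free.
  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)                     // a lone variable is free
    case Lam(x, b) => freeVars(b) - x            // λx binds the free x's of the body
    case App(f, a) => freeVars(f) ++ freeVars(a) // union of both sides
  }

  // λx.(x y): x is bound, y is free.
  val sample: Expr = Lam("x", App(Var("x"), Var("y")))
}
```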
\\
**Exercise** \\
**7.1.1. ** For every variable occurrence, state whether it is a // free // or a // bound // occurrence:
- $ \lambda y.(\lambda x.x \ (x \ y)) $
- $ (\lambda x.(x \ \lambda y.((x \ y) \ z)) \ (x \ \lambda y.x)) $
- $ (\lambda f.(\lambda x.f \ (x \ x)) \ (\lambda x.f \ (x \ x))) $
Solutions: \\ \\
1. $ \lambda y.(\lambda x.x_1 \ (x_2 \ y_1)) $ \\
Bound occurrences: $ x_1 $ (by $ \lambda x $), $ y_1 $ (by $ \lambda y $). \\
Free occurrences: $ x_2 $ (it lies outside the scope of $ \lambda x.x_1 $). \\
In the sub-context of $ (\lambda x.x_1 \ (x_2 \ y_1)) $, $ y_1 $ is free. \\
\\
2. $ (\lambda x.(x_1 \ \lambda y_1.((x_2 \ y_2) \ z_1)) \ (x_3 \ \lambda y_3.x_4)) $ \\
Bound occurrences: $ x_1, x_2 $ (by $ \lambda x $), $ y_2 $ (by $ \lambda y_1 $). \\
Free occurrences: $ z_1, x_3, x_4 $. \\
In the sub-context of $ \lambda y_1.((x_2 \ y_2) \ z_1) $, $ x_2 $ is free. \\
\\
3. $ (\lambda f.(\lambda x_1.f_1 \ (x_2 \ x_3)) \ (\lambda x_4.f_2 \ (x_5 \ x_6))) $ \\
Bound occurrences: $ f_1 $ (by $ \lambda f $). \\
Free occurrences: $ x_2, x_3, f_2, x_5, x_6 $. \\
===== 7.2. Reduction rules =====
Using what we learned about // free // and // bound // variables, we can define an algorithm for the substitution $ e_1[x \ / \ e_2] $ that $\beta$**-reduction** relies on, by cases on the shape of $ e_1 $:
^ $ e_1 $ ^ $ e_1[x \ / \ e_2] $ ^ // condition // ^
| $ x $ | $ e_2 $ | |
| $ y $ | $ y $ | $ x \neq y $ |
| $ E_1 \ E_2 $ | $ E_1[x \ / \ e_2] \ E_2[x \ / \ e_2] $ | |
| $ \lambda x.e $ | $ \lambda x.e $ | |
| $ \lambda y.e $ | $ \lambda y.e[x \ / \ e_2] $ | $ x \neq y $, $ y $ does not appear // free // in $ e_2 $|
| $ \lambda y.e $ | $ \{\lambda z.e[y \ / \ z]\}[x \ / \ e_2] $ | $ x \neq y $, $ y $ appears // free // in $ e_2 $ ($ z $ is a new variable, distinct from $ x $, that is not free in $ e $ or $ e_2 $) |
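The table translates almost line by line into a recursive function. The following is a sketch, not a reference implementation: the term representation and the names ''subst'', ''freeVars'' and ''fresh'' are our own assumptions, and ''fresh'' just appends a number until the name is unused.

```scala
object Substitution {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)
    case Lam(x, b) => freeVars(b) - x
    case App(f, a) => freeVars(f) ++ freeVars(a)
  }

  // Hypothetical helper: first name of the form base0, base1, ... not in `used`.
  def fresh(base: String, used: Set[String]): String = {
    var i = 0
    while (used.contains(s"$base$i")) i += 1
    s"$base$i"
  }

  // subst(e1, x, e2) computes e1[x / e2]; one case per table row.
  def subst(e1: Expr, x: String, e2: Expr): Expr = e1 match {
    case Var(y)               => if (y == x) e2 else e1
    case App(f, a)            => App(subst(f, x, e2), subst(a, x, e2))
    case Lam(y, _) if y == x  => e1 // x is rebound here: nothing to replace
    case Lam(y, b) if !freeVars(e2).contains(y) =>
      Lam(y, subst(b, x, e2))
    case Lam(y, b) =>
      // y would capture a free y of e2: rename the parameter to z first.
      val z = fresh(y, freeVars(b) ++ freeVars(e2) + x)
      subst(Lam(z, subst(b, y, Var(z))), x, e2)
  }
}
```

For example, $ (\lambda y.x)[x \ / \ y] $ must not become $ \lambda y.y $. With the code above the parameter is renamed first, giving ''Lam("y0", Var("y"))''.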
==== Evaluation order ====
An expression of the form $(\lambda x.e_1 \ e_2)$ is called a **redex** (reducible expression).
**Q:** If we have multiple **redexes** in an expression, which one do we evaluate?
**A:** We can evaluate any of them. The [[https://en.wikipedia.org/wiki/Church%E2%80%93Rosser_theorem | Church-Rosser theorem]] guarantees that an expression has at most one $ \beta $**-normal form** (up to $\alpha$-conversion), so every reduction sequence that terminates ends in the same result. Not every reduction sequence is guaranteed to terminate, though.
To avoid choosing **redexes** at random, there exist //reduction strategies//, of which we will use **Normal Order** and **Applicative Order**: \\
* **Normal Order** evaluation consists of always reducing the //leftmost//, //outermost// **redex** (whenever possible, substitute the arguments into the function body). If the expression has a normal form, this strategy always reaches it. \\
* **Applicative Order** evaluation consists of always reducing the //leftmost//, //innermost// **redex** (always reduce the function argument before the function itself). \\
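The two strategies differ only in where they look for the next redex. The sketch below makes this concrete on a toy term representation. All names (''stepNormal'', ''stepApplicative'', ''normalize'', the step limit of 100) are assumptions for illustration, and the substitution helper is a compact version of the algorithm in the table above.

```scala
object ReductionStrategies {
  sealed trait Expr
  final case class Var(name: String) extends Expr
  final case class Lam(param: String, body: Expr) extends Expr
  final case class App(fn: Expr, arg: Expr) extends Expr

  def freeVars(e: Expr): Set[String] = e match {
    case Var(x)    => Set(x)
    case Lam(x, b) => freeVars(b) - x
    case App(f, a) => freeVars(f) ++ freeVars(a)
  }

  def fresh(base: String, used: Set[String]): String = {
    var i = 0
    while (used.contains(s"$base$i")) i += 1
    s"$base$i"
  }

  // Capture-avoiding substitution e1[x / e2].
  def subst(e1: Expr, x: String, e2: Expr): Expr = e1 match {
    case Var(y)              => if (y == x) e2 else e1
    case App(f, a)           => App(subst(f, x, e2), subst(a, x, e2))
    case Lam(y, _) if y == x => e1
    case Lam(y, b) if !freeVars(e2).contains(y) => Lam(y, subst(b, x, e2))
    case Lam(y, b) =>
      val z = fresh(y, freeVars(b) ++ freeVars(e2) + x)
      subst(Lam(z, subst(b, y, Var(z))), x, e2)
  }

  // One normal-order step: try the outermost redex first, then go left to right.
  def stepNormal(e: Expr): Option[Expr] = e match {
    case App(Lam(x, b), a) => Some(subst(b, x, a))
    case App(f, a) =>
      stepNormal(f).map(App(_, a)).orElse(stepNormal(a).map(App(f, _)))
    case Lam(x, b) => stepNormal(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // One applicative-order step: reduce inside function and argument first,
  // and contract the enclosing redex only when nothing inside is reducible.
  def stepApplicative(e: Expr): Option[Expr] = e match {
    case App(f, a) =>
      stepApplicative(f).map(App(_, a))
        .orElse(stepApplicative(a).map(App(f, _)))
        .orElse(f match {
          case Lam(x, b) => Some(subst(b, x, a))
          case _         => None
        })
    case Lam(x, b) => stepApplicative(b).map(Lam(x, _))
    case Var(_)    => None
  }

  // Repeats `step`; returns None if no normal form is reached within `fuel` steps.
  def normalize(step: Expr => Option[Expr], e: Expr, fuel: Int): Option[Expr] =
    step(e) match {
      case None       => Some(e)
      case Some(next) => if (fuel == 0) None else normalize(step, next, fuel - 1)
    }

  // (λa.b (λc.(c c) λc.(c c))): the argument reduces to itself forever.
  val loop: Expr = App(Lam("c", App(Var("c"), Var("c"))), Lam("c", App(Var("c"), Var("c"))))
  val sample: Expr = App(Lam("a", Var("b")), loop)

  val normalResult: Option[Expr]      = normalize(stepNormal, sample, 100)
  val applicativeResult: Option[Expr] = normalize(stepApplicative, sample, 100)
}
```

On ''sample'', normal order discards the looping argument in one step, while applicative order keeps reducing the argument and runs out of steps.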
**Exercise** \\
**7.2.1. ** Evaluate in both **Normal Order** and **Applicative Order** the following expressions:
- $ (\lambda x.\lambda y.\lambda z.((x \ z) \ y) \ \lambda x.\lambda y.x) $
- $ ((\lambda x.\lambda y.((x \ y) \ x) \ \lambda x.\lambda y.x) \ (\lambda x.\lambda y.\lambda z.((x \ z) \ y) \ \lambda x.\lambda y.y))$
- $ (\lambda x.y \ (\lambda x.(x \ x) \ \lambda x.(x \ x)))$
Solutions:
===== Lambda calculus as a programming language =====
The [[https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis | Church-Turing thesis]] asserts that any //computable// function can be computed using lambda calculus (or Turing Machines or equivalent models). \\
How can this be? Everything in Lambda Calculus is a function; there are no numbers to compute //stuff// with. While the numbers we are used to are indeed missing, we can define **higher-order functions** that are analogs of the concepts we are familiar with and use them instead. \\
The representations we present next are called **Church encodings**, because they were first used by Alonzo Church, the inventor of Lambda Calculus.
==== 7.3. Booleans ====
We can encode boolean values **TRUE** and **FALSE** in lambda calculus as functions that take 2 values, **x** and **y**, and return the first (for **TRUE**) or second (for **FALSE**) value. \\
$ TRUE = \lambda x.\lambda y.x$ \\
$ FALSE = \lambda x.\lambda y.y$ \\
As we defined it, **TRUE** is sometimes called the **K**-Combinator (or //Kestrel//), and **FALSE** the **KI**-Combinator (or //Kite//). \\
{{:pp:2024:kestrel.jpg?nolink&200|}}
{{:pp:2024:kite.jpg?nolink&200|}}
Some common operations on booleans (that were discussed during the lecture) are: \\
\\
$ AND = \lambda x.\lambda y.((x \ y) \ x) $ \\
$ OR = \lambda x.\lambda y.((x \ x) \ y) $ \\
$ NOT = \lambda x.((x \ FALSE) \ TRUE) $ \\
$ IF = \lambda c.\lambda t.\lambda e.((c \ t) \ e) $ \\
**NOT** can also be written as: \\
\\
$ NOT = \lambda x.\lambda a.\lambda b.((x \ b) \ a) $ \\
\\
You can convince yourself that this works by evaluating $ (NOT \ TRUE) $ and $ (NOT \ FALSE) $. This way of writing **NOT** is also called the **C**-Combinator (or //Cardinal//). \\
{{:pp:2024:cardinal.jpg?nolink&200|}}
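The boolean encodings can be tried out directly in Scala. Since Scala is typed, this sketch wraps every function in a hypothetical class ''Fn'' (a function from ''Fn'' to ''Fn''), which plays the role of an untyped lambda term. ''lam'' and ''toBoolean'' are helper names we made up. ''toBoolean'' decodes a Church boolean by letting it choose between two marker values.

```scala
object ChurchBooleans {
  // Untyped functions: every value is a function from Fn to Fn.
  final class Fn(val run: Fn => Fn) { def apply(x: Fn): Fn = run(x) }
  def lam(f: Fn => Fn): Fn = new Fn(f)

  val TRUE: Fn  = lam(x => lam(y => x))
  val FALSE: Fn = lam(x => lam(y => y))

  val AND: Fn = lam(x => lam(y => x(y)(x)))
  val OR: Fn  = lam(x => lam(y => x(x)(y)))
  val NOT: Fn = lam(x => x(FALSE)(TRUE))
  // The Cardinal-style variant of NOT.
  val NOT_C: Fn = lam(x => lam(a => lam(b => x(b)(a))))

  // Decoder (not part of the calculus): which of two markers does b pick?
  def toBoolean(b: Fn): Boolean = {
    val yes = lam(x => x)
    val no  = lam(x => x)
    b(yes)(no) eq yes
  }
}
```

For instance, ''toBoolean(AND(TRUE)(FALSE))'' is ''false'' and ''toBoolean(NOT_C(FALSE))'' is ''true''.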
----
**Exercises** \\
**7.3.1.** Define the $ XOR $ operation over booleans.
**7.3.2.** Define the $ NAND $ operation over booleans.
**7.3.3.** Define the $ NOR $ operation over booleans.
Solutions:
https://aartaka.me/lambda-3.html
\\
7.3.1: \\
V1: $ XOR = \lambda a.\lambda b.((a \ (NOT \ b)) \ b) $ \\
V2: $ XOR = \lambda a.\lambda b.((OR \ ((AND \ (NOT \ a)) \ b)) \ ((AND \ (NOT \ b)) \ a)) $
\\
7.3.2: \\
V1: $ NAND = \lambda a.\lambda b.((a \ ((b \ FALSE) \ TRUE)) \ TRUE) $ \\
V2: $ NAND = \lambda a.\lambda b.(NOT \ ((AND \ a) \ b)) $
\\
7.3.3: \\
V1: $ NOR = \lambda a.\lambda b.((a \ FALSE) \ (NOT \ b)) $ \\
V2: $ NOR = \lambda a.\lambda b.(NOT \ ((OR \ a) \ b)) $
==== Pairs - Lecture Reminder ====
We can also encode // data structures //. We will only look at one of the simpler ones, the **pair**. \\
A pair encapsulates two values together, which we can later access using $ FIRST $ and $ SECOND $. \\
\\
$ PAIR = \lambda a.\lambda b.\lambda z.((z \ a) \ b) $ \\
$ FIRST = \lambda p.(p \ TRUE) $ \\
$ SECOND = \lambda p.(p \ FALSE) $ \\
The $ PAIR $ higher-order function we defined is also called the **V**-Combinator (or //Vireo//). \\
{{:pp:2024:vireo.jpg?nolink&200|}}
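To see how the selectors work, here is a sketch of the pair encoding in Scala. As before, ''Fn'' and ''lam'' are hypothetical helpers that stand in for untyped lambda terms, and ''left'' and ''right'' are two marker values used only for the demonstration.

```scala
object ChurchPairs {
  final class Fn(val run: Fn => Fn) { def apply(x: Fn): Fn = run(x) }
  def lam(f: Fn => Fn): Fn = new Fn(f)

  val TRUE: Fn  = lam(x => lam(y => x))
  val FALSE: Fn = lam(x => lam(y => y))

  // A pair stores a and b and hands them to whatever selector z it receives.
  val PAIR: Fn   = lam(a => lam(b => lam(z => z(a)(b))))
  val FIRST: Fn  = lam(p => p(TRUE))  // TRUE selects the first component
  val SECOND: Fn = lam(p => p(FALSE)) // FALSE selects the second component

  // Two distinguishable marker values.
  val left: Fn  = lam(x => x)
  val right: Fn = lam(x => x)
  val p: Fn = PAIR(left)(right)
}
```

Here ''FIRST(p)'' returns the very object ''left'', because $ (p \ TRUE) $ reduces to $ ((TRUE \ left) \ right) $.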
==== 7.4. Natural Numbers - Church numerals ====
Church numerals represent natural numbers as **higher-order functions**. Under this representation, the number //**n**// is a function that maps **f** to its **n-fold composition**. \\
\\
$ N0 = \lambda f.\lambda x. x $ \\
$ N1 = \lambda f.\lambda x. (f \ x) $ \\
$ N2 = \lambda f.\lambda x. (f \ (f \ x)) $ \\
...
Does **N0** look familiar? It's the same as **FALSE** if you rename the variables (using $\alpha$-conversion).
You can also define operations on Church numerals; some of them (discussed during the lecture) are: \\
\\
$ SUCC = \lambda n.\lambda f.\lambda x.(f \ ((n \ f) \ x)) $ \\
$ ISZERO = \lambda n.((n \ \lambda x.FALSE) \ TRUE) $ \\
$ ADD = \lambda n.\lambda m.\lambda f.\lambda x.((n \ f) \ ((m \ f) \ x)) $ \\
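The numerals and the operations above can be checked in Scala with the same untyped wrapper idea. This is a sketch only: ''Fn'', ''lam'', ''toInt'' and ''toBoolean'' are helper names of our own. ''toInt'' decodes a numeral by counting how many times it calls the function it is given.

```scala
object ChurchNumerals {
  final class Fn(val run: Fn => Fn) { def apply(x: Fn): Fn = run(x) }
  def lam(f: Fn => Fn): Fn = new Fn(f)

  val TRUE: Fn  = lam(x => lam(y => x))
  val FALSE: Fn = lam(x => lam(y => y))

  val N0: Fn = lam(f => lam(x => x))
  val N1: Fn = lam(f => lam(x => f(x)))
  val N2: Fn = lam(f => lam(x => f(f(x))))

  val SUCC: Fn   = lam(n => lam(f => lam(x => f(n(f)(x)))))
  val ISZERO: Fn = lam(n => n(lam(_ => FALSE))(TRUE))
  val ADD: Fn    = lam(n => lam(m => lam(f => lam(x => n(f)(m(f)(x))))))

  // Decoder (not part of the calculus): count the applications of f.
  def toInt(n: Fn): Int = {
    var count = 0
    n(lam { x => count += 1; x })(lam(x => x))
    count
  }

  def toBoolean(b: Fn): Boolean = {
    val yes = lam(x => x)
    val no  = lam(x => x)
    b(yes)(no) eq yes
  }
}
```

For example, ''toInt(ADD(N2)(SUCC(N2)))'' evaluates to ''5''.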
\\
----
**Exercises** \\
**7.4.1.** Define multiplication under Church numerals: $ MULT = \lambda n.\lambda m. \ ... $ (**Hint:** you can do it without the **Y**-Combinator)
**7.4.2.** Define exponentiation under Church numerals: $ EXP = \lambda n.\lambda m. \ ... $
**7.4.3. (*)** Define the predecessor operator, which takes a number and returns the number prior to it. What's the predecessor of 0? Evaluate $ (PRED \ N0) $.
**7.4.4.** Define subtraction under Church numerals: $ SUB = \lambda n.\lambda m. \ ... $ (**Hint**: use $ PRED $). What happens if you try to subtract a bigger number from a smaller one? Evaluate $ (SUB \ N1 \ N2 )$.
**7.4.5.** Define $ LEQ $ (less than or equal). $ LEQ \ n \ m $ should return **TRUE** if $ n \leq m $ and **FALSE** if $ n > m $.
**7.4.6.** Define $ EQ $ (equality). $ EQ \ n \ m $ should return **TRUE** if $ n = m $ and **FALSE** otherwise.
Solutions: \\ \\
7.4.3 \\
Let's start by defining a // shift-and-increment // operator: \\
$ \phi' = \lambda x.((PAIR \ x) \ (SUCC \ x)) $ \\
\\
This takes a number $ n $ and returns a pair made up of the number and its successor: ( $ n $ , $ (SUCC \ n) $ ). \\
\\
To be able to iterate this function (apply it to its own output), we make the input a pair as well, where the second value is the 'real' input: \\
$ \phi = \lambda p.((PAIR \ (SECOND \ p)) \ (SUCC \ (SECOND \ p))) $ \\
\\
This takes a pair ( $ n $, $ (SUCC \ n) $ ) and returns another pair ( $ (SUCC \ n) $, $ (SUCC \ (SUCC \ n)) $ ).
\\
Now we can just iterate this **n** times starting with the pair ( $ N0 $, $ N0 $ ), and we get the pair ( $ n - 1 $, $ n $ ), whose first value is our predecessor. For $ n = 0 $ nothing is iterated and the pair stays ( $ N0 $, $ N0 $ ), so $ (PRED \ N0) $ is $ N0 $: \\
$ PRED = \lambda n.(FIRST \ ((n \ \phi) \ (PAIR \ N0 \ N0))) $ \\
\\
An alternative solution, which uses a value container, is the following (unfortunately, we will not explain it in further detail here): \\
$ PRED = \lambda n.\lambda f.\lambda x.(((n \ (\lambda g.\lambda h.(h \ (g \ f)))) \ \lambda u.x) \ \lambda v.v) $ \\
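Both definitions of $ PRED $ can be compared experimentally. The sketch below reuses the hypothetical ''Fn''/''lam'' wrapper and the ''toInt'' decoder (all names are our own assumptions). ''PRED'' is the pair-based version and ''PRED2'' the value-container version.

```scala
object ChurchPred {
  final class Fn(val run: Fn => Fn) { def apply(x: Fn): Fn = run(x) }
  def lam(f: Fn => Fn): Fn = new Fn(f)

  val TRUE: Fn  = lam(x => lam(y => x))
  val FALSE: Fn = lam(x => lam(y => y))

  val PAIR: Fn   = lam(a => lam(b => lam(z => z(a)(b))))
  val FIRST: Fn  = lam(p => p(TRUE))
  val SECOND: Fn = lam(p => p(FALSE))

  val N0: Fn   = lam(f => lam(x => x))
  val SUCC: Fn = lam(n => lam(f => lam(x => f(n(f)(x)))))
  val N3: Fn   = SUCC(SUCC(SUCC(N0)))

  // phi maps the pair (a, b) to (b, SUCC b).
  val PHI: Fn  = lam(p => PAIR(SECOND(p))(SUCC(SECOND(p))))
  // Iterate phi n times from (N0, N0), then take the first component.
  val PRED: Fn = lam(n => FIRST(n(PHI)(PAIR(N0)(N0))))

  // The value-container variant.
  val PRED2: Fn = lam(n => lam(f => lam(x =>
    n(lam(g => lam(h => h(g(f)))))(lam(u => x))(lam(v => v)))))

  // Decoder (not part of the calculus): count the applications of f.
  def toInt(n: Fn): Int = {
    var count = 0
    n(lam { x => count += 1; x })(lam(x => x))
    count
  }
}
```

Both versions map ''N3'' to a numeral that decodes to ''2'', and both map ''N0'' to a numeral that decodes to ''0''.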
\\
==== 7.5. Recursion and the Sage Bird ====
In lambda calculus, recursion is achieved using the fixed-point combinator (or **Y** combinator, // "Why" // bird or //Sage bird//). A fixed-point combinator is a **higher-order** function that returns some fixed point of its argument function (**x** is a fixed point of a function **f** if $ f(x) = x $). That means: $ f \ (fix \ f) = fix \ f $ . And by repeated application: $ fix \ f = f \ (f \ (... f \ (fix \ f)...)) $ \\
\\
The **Y**-combinator in lambda calculus looks like this: \\
\\
$ FIX = \lambda f.(\lambda x.(f \ (x \ x)) \ \lambda x.(f \ (x \ x))) $
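A direct transcription of $ FIX $ does not terminate in a strict language such as Scala, because the self-application $ (x \ x) $ is evaluated before $ f $ gets a chance to run. A common workaround is to delay it by wrapping it in an extra function, $ \lambda v.((x \ x) \ v) $. The sketch below takes that route. ''Rec'', ''fix'' and ''sumTo'' are names we chose for illustration, and ''Rec'' exists only so that the self-application type-checks.

```scala
object FixedPoint {
  // Wrapper that allows a function to receive itself as an argument (x x).
  final case class Rec[A](run: Rec[A] => A)

  // Strict-evaluation variant of FIX: (x x) is delayed as v => x.run(x)(v).
  def fix[A, B](f: (A => B) => (A => B)): A => B = {
    val w: Rec[A => B] = Rec[A => B](x => f(v => x.run(x)(v)))
    w.run(w) // corresponds to (λx.f (x x)) applied to itself
  }

  // Example: 0 + 1 + ... + n, written without referring to its own name.
  val sumTo: Int => Int =
    fix[Int, Int](rec => n => if (n == 0) 0 else n + rec(n - 1))
}
```

The function passed to ''fix'' receives ''rec'', the "rest of the recursion", as an ordinary parameter, which is the same shape that a solution using the **Y**-Combinator takes.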
----
**Exercises** \\
**7.5.1. (*)** Using the **Y**-Combinator, define a function that computes the factorial of a number **n**.
**7.5.2. (*)** Using the **Y**-Combinator, define a function $ FIB $ that computes the **n**-th fibonacci number.
Solutions:
\\ \\
7.5.1 \\
$ FACT = \lambda n.((FIX \ FACT') \ n) $ \\
$ FACT' = \lambda rec.\lambda n.(((IF \ (ISZERO \ n)) \ N1) \ ((MULT \ n) \ (rec \ (PRED \ n)))) $ \\
In pseudocode: ''if (n == 0) 1 else n * rec(n - 1)''. \\