
Types in functional programming

Typing in programming languages

Strong vs weak typing

Consider the following example from JavaScript, where the operator + is applied to arguments of different types. The results are given below:

[] + []  =  ""
[] + {}  =  "[object Object]"
{} + []  =  0
{} + {}  =  NaN

where [] is the empty array and {} is the empty object. The results are quite surprising and unpredictable. To explain them, we need to look at how the JavaScript interpreter works. In JavaScript, there is a distinction between primitive and non-primitive values. Arrays (like []) and objects (like {}) are non-primitive. Since the operation + is defined on primitive values, JavaScript attempts to convert its operands to primitives. Pseudocode for this conversion is shown below:

     Convert(value) {
        if (value.valueOf() is a primitive) return it        // first, try valueOf()
        else if (value.toString() is a primitive) return it  // otherwise, fall back to toString()
             else error                                      // neither produced a primitive
     }

The conversion of [] to a string yields the empty string (the array's elements, joined by commas), while the conversion of {} to a string yields "[object Object]". This explains the first two results.

For the last two results, we must know that {}, when it occurs at the beginning of a statement, is parsed as an empty code block, and this is the case here. Hence, the JavaScript interpreter sees:

 
+ []
+ {}

where + is interpreted as a unary operator (JavaScript overloads + with such a unary version). Without going into more detail, +x behaves like a conversion of x to Number. Thus, [] converted to Number is 0, while {} converted to Number is NaN (not a number). This explains the last two results.

Moral

The JavaScript treatment of + is unimportant by itself, and it may even have some advantages for the programmer (although our example shows otherwise). However, we can note that, by allowing operands of any kind/type for +, the language gives up the ability to report nonsensical applications of + as errors, and instead silently produces surprising results.

In programming language design, there is a fundamental tension between expressiveness and typing. Typing is a means of enforcing constraints on what is deemed a correct program. It may be the case that conceptually correct programs are not accepted as correct from a typing perspective (this situation seldom occurs in Haskell).
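
For illustration (our own example, with hypothetical names), the following definition is conceptually sensible but rejected by Haskell, since both branches of an if-expression must have the same type; a well-typed workaround wraps the alternatives in a sum type:

-- Conceptually sensible, but ill-typed (does not compile):
--   pick b = if b then 1 else "one"
-- The two branches have different types (a number vs. a string).

-- A well-typed workaround using a sum type:
pick :: Bool -> Either Integer String
pick b = if b then Left 1 else Right "one"

main :: IO ()
main = print (pick True)   -- Left 1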

A strongly-typed programming language is more coercive with respect to typing constraints. In functional programming, languages such as Haskell are considered strongly-typed. In imperative / OOP programming, languages such as Java are considered strongly-typed.

A weakly-typed programming language is more relaxed w.r.t. typing constraints. For instance, in functional programming, Scheme and its descendant Racket are considered weakly-typed. In imperative / OOP programming, languages such as C and JavaScript (and especially the latter) are considered weakly-typed. In such languages, types are usually reduced to primitive constructs (programmers cannot create new types), or the type construction procedure is very simplistic. For instance, Racket (formerly known as PLT Scheme) is weakly-typed: there are no type annotations, and type verification happens only at runtime.

However, type verification is not absent in weakly-typed languages, including Racket/Scheme. For instance, the call (+ 1 '()) will produce a runtime error, since the plus operator is called on a value of an invalid type (the empty list).

We shall discuss Scheme/Racket in more detail later. It is worth noting that there exist extensions of Racket (Typed Racket) which allow programmers to define and compose types to some extent.

The weakly-typed vs strongly-typed classification is not rigid, and it is subject to debate and discussion; there is no objective right answer. For instance, here, the language C is viewed as weakly-typed. We illustrate this with a small motivating example:

#include <stdlib.h>

int f(int x) {
    if (x != 0)
        return 1;          /* an integer ...                                */
    return malloc(100);    /* ... or a pointer, implicitly converted to int */
}

In principle, the function f can return an integer or a pointer to a freshly-allocated memory area, and the compiler accepts this (it only issues a warning about the pointer-to-integer conversion). Compared to, e.g., Java, this makes the C type system more relaxed.

There are valid arguments for considering C as strongly-typed and, as said before, there is no right answer.

Compile-time vs runtime typing

This classification is done w.r.t. the moment when type verification occurs: at compile-time, or at runtime.

The former is also called static typing, while the latter, dynamic typing. In the literature, static and dynamic typing are sometimes used with other meanings; hence, here, we prefer the terms compile-time and runtime.

Imperative/OOP languages such as C and Java perform compile-time type checking, as do functional languages such as Haskell. Imperative/OOP languages such as JavaScript perform runtime type checking, as do functional languages such as Racket/Scheme.

Compile-time type checking is preferred for strongly-typed languages: the complexity of type verification is delegated to the compiler. Conversely, in weakly-typed languages, type verification is simpler, hence it can be performed by the interpreter, at runtime (sometimes a compiler is absent altogether). This is not a golden rule, but merely an observation.

While runtime type checking is simpler to deploy, it has the disadvantage of not capturing all typing bugs. Consider the following program in Racket:

(define f (lambda (x) (if x 1 (+ 1 '()))))
(f #t)

The function receives a value x which is expected to be a boolean. Running the program above produces no error, even though (+ 1 '()) is an incorrectly-typed function call: in the execution of (f #t), the else branch of the function is never reached, hence no type verification is performed on it.

Hence, runtime type checking only catches bugs on the current program execution trace.
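
By contrast, a compile-time-typed language rejects such a definition before the program ever runs, regardless of the execution trace. Below is a minimal sketch of the same function in Haskell (our own translation, anticipating the next section):

-- The direct analogue of the Racket function does not compile, even
-- though the ill-typed branch would never be executed:
--
--   f x = if x then 1 else 1 + []   -- rejected at compile time
--
-- A well-typed variant, for comparison:
f :: Bool -> Integer
f x = if x then 1 else 2

main :: IO ()
main = print (f True)   -- 1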

Typing in Haskell

Type inference in Haskell

Haskell implements the Hindley-Milner type inference algorithm. In what follows, we present a simplified, easier-to-follow, but incomplete algorithm, which illustrates the main concepts underlying the original one.

Intro

Consider the following expressions, and their types:

\x -> x + 1 :: Integer -> Integer
\x -> if x then 0 else 1 :: Bool -> Integer
zipWith (:) :: [a] -> [[a]] -> [[a]]
\f -> (f 1) + (f 2) :: (Integer -> Integer) -> Integer

We can see from the above examples that types are constructed according to the following grammar:

<type> ::= <const_type> | <type_var> | (<type>) | <type> -> <type> | [<type>]

This grammar only tells half the story regarding Haskell typing; however, for the purposes of this lecture, this view suffices. According to the above grammar, types can be: constant types (e.g. Integer, Bool), type variables (e.g. a, b), function types, and list types.
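
To make this concrete, the grammar can be encoded as a Haskell data type; the sketch below (with our own constructor names) has one constructor per grammar rule, except parentheses, whose grouping is already explicit in the tree:

data Type = TConst String      -- constant types, e.g. Integer, Bool
          | TVar String        -- type variables, e.g. a, b
          | TFun Type Type     -- function types, t1 -> t2
          | TList Type         -- list types, [t]
          deriving (Show, Eq)

-- Example: the type [a] -> [[a]] -> [[a]] of zipWith (:)
example :: Type
example = TFun (TList (TVar "a"))
               (TFun (TList (TList (TVar "a")))
                     (TList (TList (TVar "a"))))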

Expression trees

We assume each Haskell expression is constructed via the following rules: an expression is a variable or a constant, a function definition \x -> e (where e is an expression), or a function application (f e) (where f and e are expressions).

We note that many Haskell definitions can be seen as such. For instance:

g f = (f 1) + 1

can be seen as:

g = \f -> (f 1) + 1
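
Similarly (our own additional example), definitions with several arguments can be seen as nested lambdas:

add :: Integer -> Integer -> Integer
add x y = x + y

add' :: Integer -> Integer -> Integer
add' = \x -> \y -> x + y   -- the same function, written explicitly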

Hence, we can take any Haskell expression and construct a tree in which each node represents a construction rule and the children represent sub-expressions (a possible Haskell encoding of such trees is sketched after the example below).

For example, consider the expression tree for the function g shown previously (we use indentation to illustrate the parent/child relationship):

\f -> (f 1) + 1
  f
  (f 1) + 1
    (+)
    (f 1)
      f
      1
    1
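
As a sketch (with our own, hypothetical constructor names), such trees can themselves be encoded as a Haskell data type, with one constructor per construction rule:

data Expr = Var String        -- variables and named constants, e.g. f, (+)
          | Lit Integer       -- integer constants, e.g. 1
          | Lam String Expr   -- function definition \x -> e
          | App Expr Expr     -- function application (f e)
          deriving Show

-- The tree of g = \f -> (f 1) + 1, as drawn above:
gTree :: Expr
gTree = Lam "f" (App (App (Var "+") (App (Var "f") (Lit 1))) (Lit 1))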

In what follows, we shall use expression trees to perform type inference.

Typing rules

We introduce the following typing rules:

Rule (TVar)

If v is bound to a constant expression e of type ct, then v :: ct

Rule (TFun)

If x :: t1 and e :: t2 then \x → e :: t1 → t2

Rule (TApp)

If f :: t1 → t2 and e :: t1 then (f e) :: t2

The above rule can be naturally generalised:

If f :: t1 → t2 → … → tn → t and e1 :: t1, …, en :: tn then (f e1 … en) :: t
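
For instance (our own example, using the Prelude function replicate), the generalised rule applies as follows:

-- replicate :: Int -> a -> [a]; with 3 :: Int and 'x' :: Char,
-- (TApp) yields (replicate 3 'x') :: [Char]:
xs :: [Char]
xs = replicate 3 'x'   -- "xxx"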

In what follows, we will use these rules to make judgements about types. These rules have a twofold usage: read top-down, they decompose the task of typing an expression into typing its sub-expressions; read bottom-up, they combine the types of sub-expressions into the type of the whole expression.

Type inference stage 1: Expression tree construction

Type inference for an expression e can be seen as having two stages. In the first stage, we construct the expression tree of e and annotate each node with its type: types that are not yet known are marked with ?, and fresh type variables (type hypotheses) are introduced where needed.

We illustrate the first stage on the previous definition of g:

\f -> (f 1) + 1 :: ?
  f :: tf (here we introduce tf as the type of f; this is a type hypothesis)
  (f 1) + 1 :: ?
    (+) :: ?
    (f 1) :: ?
      f :: t1 -> t2 (another hypothesis, stemming from the fact that f is applied to 1)
      1 :: ?
    1 :: ?

Type inference stage 2: Rule application

In this stage, we start from the previously-built tree and apply the typing rules at each node, replacing type hypotheses as dictated by unification.

This is equivalent to a bottom-up tree traversal: We start from the leaves, and progress to the root (i.e. the expression to be typed).

Without delving into details, type unification is an important ingredient, because it allows us to infer the most general type of an expression. Consider the Haskell expression \f x → (f x, f 1), which defines a function that takes another function f and a value x, and returns a pair: the first element of the pair is the application f x, while the second is f 1. From the sub-expression f x, we learn that f :: tx → b, where tx is the type hypothesis introduced for x; from f 1, we learn that f :: Integer → c.

The unification process combines the information collected so far: both types found for f must describe the same function, hence tx unifies with Integer, and c unifies with b.

The final type for the expression is: (Integer → b) → Integer → (b, b)
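
A minimal sketch of the unification procedure itself, over the Type encoding introduced earlier (repeated here so the sketch is self-contained; our own simplifications: no occurs-check, and naive substitution composition):

import qualified Data.Map as M

data Type = TConst String | TVar String | TFun Type Type | TList Type
  deriving (Show, Eq)

type Subst = M.Map String Type

-- Apply a substitution, following chains of bound variables.
apply :: Subst -> Type -> Type
apply s t@(TVar v) = maybe t (apply s) (M.lookup v s)
apply s (TFun a b) = TFun (apply s a) (apply s b)
apply s (TList a)  = TList (apply s a)
apply _ t          = t

-- Compute a substitution making the two types equal, if one exists.
unify :: Type -> Type -> Maybe Subst
unify (TVar a) (TVar b) | a == b = Just M.empty
unify (TVar v) t = Just (M.singleton v t)
unify t (TVar v) = Just (M.singleton v t)
unify (TConst a) (TConst b) | a == b = Just M.empty
unify (TList a) (TList b) = unify a b
unify (TFun a1 b1) (TFun a2 b2) = do
  s1 <- unify a1 a2
  s2 <- unify (apply s1 b1) (apply s1 b2)
  return (M.union s2 s1)
unify _ _ = Nothing

-- The example above: f :: tx -> b must unify with f :: Integer -> c.
main :: IO ()
main = print (unify (TFun (TVar "tx") (TVar "b"))
                    (TFun (TConst "Integer") (TVar "c")))
-- Just (fromList [("b",TVar "c"),("tx",TConst "Integer")])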

We illustrate the second stage of the type inference on the same example:

\f -> (f 1) + 1 :: tf -> Integer (via (TFun))
  f :: tf (hypothesis; tf must unify with t1 -> t2)
  (f 1) + 1 :: Integer (via (TApp))
    (+) :: Integer -> Integer -> Integer (known from the Prelude; hence t2 must unify with Integer)
    (f 1) :: t2 (via (TApp); also, t1 must unify with Integer, the type of the argument 1)
      f :: t1 -> t2 (hypothesis)
      1 :: Integer (via (TVar), from the Prelude)
    1 :: Integer (via (TVar), from the Prelude)

The (pseudo)-algorithmic procedure concludes with the following answer:

After unification, the result is shown to the programmer: g :: (Integer → Integer) → Integer
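
As a quick check (our own verification snippet, not part of the pseudo-algorithm), we can state the inferred signature explicitly and let GHC confirm it:

-- The inferred signature, stated explicitly; GHC accepts it:
g :: (Integer -> Integer) -> Integer
g = \f -> (f 1) + 1

main :: IO ()
main = print (g (+ 10))   -- (1 + 10) + 1 = 12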

Exercises. Find the types of the following expressions by applying the type synthesis pseudo-algorithm: