Polymorphism in functional languages

Translated literally, the term polymorphism means multiple shapes. Polymorphism in programming languages is a powerful modularisation tool, which allows programmers to:

  • abstract from implementation details (in the case of ad-hoc polymorphism)
  • define a single implementation for a range of types (in the case of parametric polymorphism / genericity)

Polymorphism is a mechanism supported by virtually any strongly-typed programming language, including functional and object-oriented ones.

Ad-hoc polymorphism

In an object-oriented language (say, Java), ad-hoc polymorphism allows the programmer to (dynamically) select the implementation of a function, based on the type(s) of the variable(s) to which it is applied.

Consider the following class definitions:

class Animal {
  public void talk (){
    System.out.println("I am an Animal");
  }
}
 
class Bird extends Animal {
  public void talk (){
    System.out.println("Cip-cirip");
  }
}

Finally, consider the following code:

Animal[] v = new Animal[2];
v[0] = new Animal();
v[1] = new Bird();
for (Animal a : v)
    a.talk();

whose output is:

I am an Animal
Cip-cirip

The example illustrates ad-hoc polymorphism, or method overriding:

  • the talk method from the class Animal has been overridden in the class Bird which extends Animal.
  • moreover, establishing the implementation of a.talk() is done at runtime, based on the actual (here Bird) not declared (Animal) type of a.

Method overriding should not be confused with method overloading, which means that the same function name can be used for different function implementations, each having a different signature. The following code, which continues the previous example, illustrates overloading:

  public void listen_to(Animal a){
    System.out.println("An animal is listening:");
    a.talk();
  }
  public void listen_to(Bird b){
    System.out.println("A bird is listening:");
    b.talk();
  }

The function listen_to has been overloaded. Now, consider the calls:

listen_to(v[0]);
listen_to(v[1]); 
listen_to(new Bird());

The output is:

An animal is listening:
I am an Animal
An animal is listening:
Cip-cirip
A bird is listening:
Cip-cirip

The interesting call is listen_to(v[1]). Note that the code for listen_to(Animal a) has been called (which outputs An animal is listening:), although the actual type of v[1] is Bird, as shown by the next line of the output (Cip-cirip). The example illustrates an important point:

  • method overloading is performed at compile-time, based on the declared not the actual type of the parameters.

To motivate this design decision, consider yet another example:

class Who implements Interface1, Interface2 {}
 
interface Interface1 {}
 
interface Interface2 {}
 
public class X {
    private static void method(Object o)     {}
    private static void method(Interface1 i) {}
    private static void method(Interface2 i) {}
}

Suppose for a moment that overloading relies on the actual type of an object, instead of the declared one. Now consider the following code:

   Object o = new Who();
   method(o);

Since Who implements both Interface1 and Interface2, the object referred to by o would match method(Interface1) and method(Interface2) equally well, and there would be no principled way to choose between them.

Question: What is the actual behaviour of the above code?

One final note regarding ad-hoc polymorphism is that the signatures of overloaded functions cannot differ only in the return type. This holds in Java, C++ and Scala. To see why this is the case, consider the following definitions:

class Parent {}
class Child extends Parent {}

and the methods:

public Parent method() {}
public Child method() {}

as well as the invocation:

Object o = new Parent ();
o = method();

The compiler cannot decide which implementation should be called in this particular case.

Genericity

While ad-hoc polymorphism (overriding) allows for several different implementations to be defined under the same function name, genericity in Java (and parametric polymorphism in general), allows for:

  • a single implementation to be defined over a range of types

For instance, in:

static <T> int count (List<T> l){
    int i = 0;
    for (T e:l)
        i++;
    return i;
}

the method count is defined w.r.t. lists containing any type of element (T). Technically, genericity in Java is a compile-time mechanism for checking and inserting casts: generic type arguments are checked by the compiler but are not retained at runtime. For instance, in the compilation phase, the above code is translated to:

static int count (List l){
    int i = 0;
    for (Object e:l)
        i++;
    return i;
}

This process is called type erasure. Consider another example, before type erasure:

List<Animal> l = ...
Animal e = l.get(i);

and after:

List l = ...
Animal e = (Animal) l.get(i);

Thus, type-safety is achieved via automatic casts.

Others

There are other types of polymorphism which may appear in the literature, e.g. subtype polymorphism which simply means that a variable v of type T is allowed to refer to an object of any type derived from T. Thus, subtype polymorphism is a basic OOP feature.

Parametric polymorphism

Parametric polymorphism is a fundamental trait of typed functional programming in general, and Haskell in particular. It manifests via the presence of type variables which stand for any type. Numerous functions defined so far are parametrically polymorphic:

  • foldl
  • foldr
  • map
  • zipWith

They each define a single implementation which is independent of:

  • the type of the contained elements of a list
  • the function type which is applied on elements from a list
  • etc.
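
As a small illustration, the same Prelude definitions of map, foldr and zipWith can be used at unrelated types without any change. The names bound below (lengths, negated, total, joined, labelled) are hypothetical and serve only this sketch:

```haskell
-- One implementation of map, used at two different element types.
lengths :: [Int]
lengths = map length ["ab", "c", ""]      -- the elements are Strings

negated :: [Bool]
negated = map not [True, False]           -- the elements are Bools

-- foldr is independent of both the element type and the result type.
total :: Integer
total = foldr (+) 0 [1, 2, 3]

joined :: String
joined = foldr (\w acc -> w ++ acc) "" ["poly", "morph", "ism"]

-- zipWith is independent of the element types of both lists.
labelled :: [String]
labelled = zipWith (\n s -> show n ++ s) [1 :: Int, 2] ["a", "b"]
```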

Unlike Java, in Haskell, parametric polymorphism is an intrinsic (and key) feature of the type-system. To explore it in more depth, we start with a discussion of polymorphic types.

Polymorphic types

We illustrate Haskell polymorphic types by constructing polymorphic lists precisely in the same way they are defined in Prelude:

data List a = Void | Cons a (List a)

Compared to the monomorphic lists defined in the previous lecture, we observe:

  • the newly defined type is List a, where a is a type-variable
  • Void :: (List a) which means that Void is a polymorphic value (i.e. Void can be the empty list for lists of integers, lists of strings etc.)
  • Cons :: a → (List a) → (List a), i.e. Cons takes a value of type a, a list of type (List a) (not [a]) and returns a list of type (List a)

We also illustrate a recursive conversion function, as an example:

listConvert :: (List a) -> [a]
listConvert Void = []
listConvert (Cons h t) = h:(listConvert t)
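
The definitions above can be put together into a self-contained sketch. The sample values (noIntegers, noStrings, ints, strs) are hypothetical and only show that the same constructors and the same conversion function work at several element types:

```haskell
-- The polymorphic list type from the text.
data List a = Void | Cons a (List a)

-- Conversion to a built-in list; one definition for every element type a.
listConvert :: List a -> [a]
listConvert Void       = []
listConvert (Cons h t) = h : listConvert t

-- Void is a polymorphic value: here it is used at two different types.
noIntegers :: List Integer
noIntegers = Void

noStrings :: List String
noStrings = Void

-- Cons builds lists over any element type.
ints :: List Integer
ints = Cons 1 (Cons 2 (Cons 3 Void))

strs :: List String
strs = Cons "a" (Cons "b" Void)
```

For instance, listConvert ints evaluates to [1,2,3] and listConvert strs to ["a","b"].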

Pairs (and tuples in general) are a very useful data structure, and they can be defined as follows:

data Pair a b = Pair a b

This definition requires more care when reading it:

  • data Pair a b defines a polymorphic type, where two independent type variables occur: the type of the first element of the pair, and that of the second. These two types need not coincide;
  • Pair :: a → b → Pair a b is the unique data constructor for pairs: it takes an element of type a, one of type b and produces an element of type Pair a b.

As before, we write an illustrative conversion function:

pairconvert :: Pair a b -> (a,b)
pairconvert (Pair x y) = (x,y)

The programmer should not confuse the type name Pair from the type Pair a b with the data constructor Pair :: a → b → Pair a b. Similar to the language C, where two namespaces exist: one for structures and one for types (with the typedef instruction to create new types), here we also have two namespaces:

  • one for types (where Pair has been defined via the l.h.s. of the = in the data definition)
  • one for values (and functions) (where the data constructor Pair has been defined)
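
To make the two namespaces visible, the sketch below defines the same structure twice: once as in the text, and once with a differently named data constructor. The names Pair2, MkPair2, pairconvert2 and example are hypothetical and introduced only for this comparison:

```haskell
-- As in the text: the type and its data constructor share the name Pair.
data Pair a b = Pair a b

-- The same structure, with the two roles given different names.
data Pair2 a b = MkPair2 a b

pairconvert :: Pair a b -> (a, b)        -- here Pair names the type
pairconvert (Pair x y) = (x, y)          -- here Pair names the data constructor

pairconvert2 :: Pair2 a b -> (a, b)      -- Pair2 lives in the type namespace
pairconvert2 (MkPair2 x y) = (x, y)      -- MkPair2 lives in the value namespace

-- The two components are independent and may have different types.
example :: (Integer, String)
example = pairconvert (Pair 1 "one")
```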

We also define the polymorphic tree datatype:

data Tree a = Leaf | Node (Tree a) a (Tree a)

Type constructors

Let us recall the syntax for types, as presented in the previous lecture. It mainly consisted of type-variables (anything), function-types as well as list types. To this list we may add any other type introduced via data.

However, there is a more uniform and elegant way of describing these types. It treats type construction functionally:

  • We require special functions called type constructors, which take types as parameters and return types
  • To construct new types, we apply type constructors on type expressions (e.g. monomorphic types or type variables).

The List type constructor

In our previous data List a = … definition:

  • List is a type constructor
  • Since List is, conceptually, also a function, it must have a type. The type of a type constructor is called a kind in Haskell. Thus, the kind of List is written as: List :: * ⇒ *, which reads: List receives a type and returns a type (GHC itself prints this kind with a plain arrow, as * -> *)
  • The polymorphic type List a is actually a type function application. The function is List and the parameter is the variable a.
  • Similarly, the monomorphic type List Integer (or similarly [Integer]) is constructed as an application of List on the monomorphic type Integer.

Exercise: Describe the construction of the following types. What do they stand for?

  • [(List a)]
  • (List [a])

The Pair type constructor

  • Pair :: * ⇒ * ⇒ * is a type constructor with kind * ⇒ * ⇒ *. It takes two types and produces a type.
  • The type Pair a Integer is polymorphic and represents the type of any pair whose second component is an integer.

With this observation, we can improve our syntax for types, as follows:

<type> ::= <const_type> | <type_var> | <type_constructor_application>
<type_constructor_application> ::= <type_const_1> <type> | <type_const_2> <type> <type> | ...

where:

  • <type_const_1> is any type constructor having kind * ⇒ *
  • <type_const_2> is any type constructor having kind * ⇒ * ⇒ *
  • etc.

To conclude, we observe that the function type is also constructed via the application of the type constructor:

  • (→) :: * ⇒ * ⇒ *

on specific types or type expressions.
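
Kinds can be checked by the compiler. The following is a minimal sketch, assuming GHC with the KindSignatures extension. Type (from Data.Kind) is GHC's name for the kind written * in these notes, and Wrap is a hypothetical helper that accepts only type constructors taking one type and returning a type:

```haskell
{-# LANGUAGE KindSignatures #-}
import Data.Kind (Type)

data List a   = Void | Cons a (List a)
data Pair a b = Pair a b

-- Wrap demands a type constructor f that maps a type to a type.
newtype Wrap (f :: Type -> Type) a = Wrap (f a)

-- List takes one type, so it fits directly.
wrappedList :: Wrap List Integer
wrappedList = Wrap (Cons 1 Void)

-- Pair takes two types; applied to one type, it still expects one more.
wrappedPair :: Wrap (Pair Bool) Integer
wrappedPair = Wrap (Pair True 2)

-- The function arrow is a binary type constructor and can be written prefix.
increment :: (->) Integer Integer
increment x = x + 1

-- Rejected by the compiler, because Integer is not a type constructor:
-- bad :: Wrap Integer Integer
```

The partial application Pair Bool is accepted where a constructor taking one type is expected, which is what the curried kind of Pair suggests.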

Ad-hoc polymorphism

Ad-hoc polymorphism is necessary in typed functional languages, and we illustrate it via a few examples:

  • to display an object, the interpreter calls the function show :: a → String, which takes a value of an arbitrary type and converts it to a String. Naturally, show requires type-dependent implementations
  • similarly, the + operation has different implementations for Integers, Floats, and may be extended for other objects as well.
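
Both cases can be observed with types from Prelude alone. In the sketch below (all bound names are hypothetical), the same names show and + resolve to different implementations depending on the types involved:

```haskell
-- show selects a different implementation for each argument type.
shownInteger :: String
shownInteger = show (42 :: Integer)

shownBool :: String
shownBool = show True

shownList :: String
shownList = show [1 :: Integer, 2]

-- (+) likewise: integer addition versus floating-point addition.
sumIntegers :: Integer
sumIntegers = 1 + 2

sumDoubles :: Double
sumDoubles = 1.5 + 2
```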

Towards type-classes

Consider the types Nat and List a defined in previous lectures. To make objects of type Nat or List a showable, we require functions of signature Nat → String and List a → String, respectively. We define them below:

showNat :: Nat -> String
showNat =
    let c Zero = 0
        c (Succ x) = 1 + (c x)
    in show . c
 
showList :: (List a) -> String
showList = lfoldr (\x y -> (show x)++":"++y) "[]"

The implementation of showList relies on lfoldr :: (a → b → b) → b → List a → b. Also, written in Haskell as-is, showList has problems regarding the call (show x). Consider that x::(List Nat) or that x :: Nat. Depending on x's type, we need to call different show functions. What is obvious already is that we need a single function name (e.g. show) which should have type-dependent implementations.

An attempt to solve this issue is by introducing a new type:

data Showable a = C1 Nat | C2 (List a)
 
show :: (Showable a) -> String
show (C1 x) = showNat x
show (C2 x) = showList x

An object of type Showable a indicates a value which can be displayed. For each showable type, we define separate construction rules (C1 resp. C2). The above code still has a problem:

  • showList :: (List a) → String, however, to be able to call (show x), x must be showable, hence x::Showable a

To solve this issue, we modify the signature of showList:

showList :: (List (Showable a)) -> String

as well as the definition of Showable a:

data Showable a = C1 Nat | C2 (List (Showable a))

Our approach to handling ad-hoc polymorphism suffers from a single drawback:

  • it relies on type-packing. The programmer needs to handle both user-defined values (e.g. Zero :: Nat), as well as showable values (e.g. C1 Zero :: Showable a); we have two, type-dependent representations for the same object.
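
The type-packing approach can be assembled into a compilable sketch. It assumes that Nat is defined as Zero | Succ Nat and that lfoldr is the usual right fold over List; the Prelude names show and showList are hidden so that ours can take their place, and packed is a hypothetical sample value:

```haskell
import Prelude hiding (show, showList)
import qualified Prelude as P

data Nat    = Zero | Succ Nat
data List a = Void | Cons a (List a)

-- Right fold over List, with the signature quoted in the text.
lfoldr :: (a -> b -> b) -> b -> List a -> b
lfoldr _ acc Void       = acc
lfoldr f acc (Cons h t) = f h (lfoldr f acc t)

-- Every showable value has to be packed with C1 or C2.
data Showable a = C1 Nat | C2 (List (Showable a))

showNat :: Nat -> String
showNat = P.show . count
  where
    count :: Nat -> Integer
    count Zero     = 0
    count (Succ n) = 1 + count n

showList :: List (Showable a) -> String
showList = lfoldr (\x acc -> show x ++ ":" ++ acc) "[]"

show :: Showable a -> String
show (C1 n)  = showNat n
show (C2 xs) = showList xs

-- The drawback in practice: the list of naturals 0, 1 cannot be shown
-- as a plain List Nat; each element and the list itself must be wrapped.
packed :: Showable a
packed = C2 (Cons (C1 Zero) (Cons (C1 (Succ Zero)) Void))
```

Here show packed evaluates to "0:1:[]", but only because every value was first rewrapped in its Showable representation.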

Type-classes

To solve the above issue, ad-hoc polymorphism is implemented in Haskell via type-classes, which are conceptually different from classes in OOP. In short:

  • a type-class describes a collection of types, which can be defined by the user
  • type-classes also contain function signatures which are traits specific to each type in the type-class
  • types can be enrolled in type-classes by the user
  • relationships between type-classes (e.g. inclusion) can be defined by the user.

We illustrate all the above by introducing the definition for the type-class Show:

class Show a where
   show :: a -> String

In the above definition, a is an arbitrary type which is enrolled in class Show. Any such type supports the function show, which is defined as part of the type-class. The following code enrols our previous Nat type in class Show, hence making naturals showable:

instance Show Nat where
    show = 
      let convert Zero = 0
          convert (Succ n) = 1 + (convert n)
      in show . convert

In our implementation, the type of show . convert is Nat → String, since convert :: Nat → Integer. This example also shows ad-hoc polymorphism in action. In the above expression (the functional composition), the general type show :: (Show a) ⇒ a → String is specialised, via unification, to Integer → String. Thus, the compiler knows to call the integer implementation of show, which is part of Prelude.

Also, note the interpretation of the type signature: show :: (Show a) ⇒ a → String which tells us that:

  • show :: a → String where a must be enrolled in the type-class Show

We continue by enrolling List a in class Show. Recall that, to be able to show lists, the elements from the list need to be showable. Hence, the enrollment is:

instance (Show a) => Show (List a) where
    show Void = "[]"
    show (Cons h t) = (show h)++":"++(show t)
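
The two enrollments can be combined into one self-contained sketch, assuming Nat is defined as Zero | Succ Nat and using Prelude's own Show class. The helper describe and the sample values nats and nested are hypothetical:

```haskell
data Nat    = Zero | Succ Nat
data List a = Void | Cons a (List a)

instance Show Nat where
  show = show . convert          -- the inner show is the Integer implementation
    where
      convert :: Nat -> Integer
      convert Zero     = 0
      convert (Succ n) = 1 + convert n

-- Lists are showable whenever their elements are.
instance Show a => Show (List a) where
  show Void       = "[]"
  show (Cons h t) = show h ++ ":" ++ show t

-- A function usable with any type enrolled in Show.
describe :: Show a => a -> String
describe x = "value: " ++ show x

nats :: List Nat
nats = Cons Zero (Cons (Succ (Succ Zero)) Void)

-- Nesting works without extra code: List Nat is itself enrolled in Show.
nested :: List (List Nat)
nested = Cons nats (Cons Void Void)
```

No wrapping is needed any more: show nats evaluates to "0:2:[]" directly on a value of type List Nat.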

Finally, we illustrate another kind of enrollment. It stems from the observation that both lists and trees support map operations, which have very similar behaviour:

lmap :: (a -> b) -> (List a) -> (List b)
tmap :: (a -> b) -> (Tree a) -> (Tree b)

We can define the class of mappable types, which contains an fmap operation. In Haskell, this class is called Functor. A tentative definition class Functor t where raises the question of what t should be, such that all constraints from the map signatures are preserved:

  • map transforms containers of a given sort (e.g. Lists) into containers of the same sort
  • if the mapped transformation is (a → b), then the first container must have elements of type a while the second - of type b.

The solution is:

class Functor t where
   fmap :: (a -> b) -> t a -> t b

where t is a type-constructor with kind t :: * ⇒ *. Thus, we have the following enrollments:

instance Functor List where
  fmap = ...
 
instance Functor Tree where
  fmap = ...
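
One possible way to write these two instances is sketched below, using Prelude's Functor class. This is an illustration under the List and Tree definitions from earlier, not necessarily the intended solution; incrementAll is a hypothetical helper:

```haskell
data List a = Void | Cons a (List a)
  deriving (Eq, Show)

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

-- Note that List and Tree are enrolled, not List a or Tree a.
instance Functor List where
  fmap _ Void       = Void
  fmap f (Cons h t) = Cons (f h) (fmap f t)

instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

-- Written once against Functor, usable with both containers.
incrementAll :: Functor t => t Integer -> t Integer
incrementAll = fmap (+ 1)
```

In both instances the shape of the container is preserved and only the elements change, which matches the two constraints listed above.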