====== Polymorphism in functional languages ======

===== Polymorphism in Object-Oriented languages =====

Translated literally, the term **polymorphism** means **multiple shapes**. Polymorphism in programming languages is a powerful modularisation tool, which allows programmers to:
  * **abstract from implementation details** (in the case of ad-hoc polymorphism)
  * **define a unique implementation for a range of types** (in the case of parametric polymorphism / genericity)

Polymorphism is a mechanism supported by virtually any strongly-typed programming language, including functional and object-oriented ones.

==== Ad-hoc polymorphism ====

In an Object-Oriented language (say, Java), **ad-hoc polymorphism** allows the programmer to //(dynamically) select the implementation of a function, based on the type(s) of the variable(s) on which the former is applied//.

Consider the following class definitions:
<code java>
class Animal {
  public void talk (){
    System.out.println("I am an Animal");
  }
}

class Bird extends Animal {
  public void talk (){
    System.out.println("Cip-cirip");
  }
}
</code>

Finally, consider the following code:
<code java>
Animal[] v = new Animal[2];
v[0] = new Animal();
v[1] = new Bird();
for (Animal a : v)
    a.talk();
</code>

whose output is:
<code>
I am an Animal
Cip-cirip
</code>

The example illustrates **ad-hoc polymorphism**, or **method overriding**:
  * the ''talk'' method from the class ''Animal'' has been overridden in the class ''Bird'' which extends ''Animal''.
  * moreover, the implementation of ''a.talk()'' is selected **at runtime**, based on the **actual** type of ''a'' (''Bird'' in the case of ''v[1]''), **not** on its **declared** type (''Animal'').

Method overriding should not be confused with **method overloading** which means that the same function name can be used for different function implementations, each having a different signature. The following code, which continues the previous example, illustrates **overloading**:

<code java>
  public void listen_to(Animal a){
    System.out.println("An animal is listening:");
    a.talk();
  }
  public void listen_to(Bird b){
    System.out.println("A bird is listening:");
    b.talk();
  }
</code>
The function ''listen_to'' has been overloaded. Now, consider the calls:
<code java>
listen_to(v[0]);
listen_to(v[1]); 
listen_to(new Bird());
</code>

The output is:
<code>
An animal is listening:
I am an Animal
An animal is listening:
Cip-cirip
A bird is listening:
Cip-cirip
</code>

The interesting call is ''listen_to(v[1])''. Note that the code for ''listen_to(Animal a)'' has been called (it outputs ''An animal is listening:''), although the actual type of ''v[1]'' is ''Bird'', as shown by the next line of the output, ''Cip-cirip''. The example illustrates an important point:

  * **method overloading** is resolved at **compile-time**, based on the **declared**, not the actual, type of the arguments.

To illustrate this design decision, consider yet another example:

<code java>
class Who implements Interface1, Interface2 {}

interface Interface1 {}

interface Interface2 {}

public class X {
    private static void method(Object o)     {}
    private static void method(Interface1 i) {}
    private static void method(Interface2 i) {}
}
</code>

Suppose for a moment that overloading relied on the actual type of an object, instead of the declared one. Now consider the following code:
<code java>
   Object o = new Who();
   method(o);
</code>

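Since ''Who'' implements both ''Interface1'' and ''Interface2'', both ''method(Interface1 i)'' and ''method(Interface2 i)'' would match the actual type of ''o'' equally well, and there would be no principled way to choose between them.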

**Question:** What is the actual behaviour of the above code?

One final note regarding ad-hoc polymorphism is that overloaded functions **cannot differ only in their return type**. This holds in Java, C++ and Scala. To see why this is the case, consider the following definitions:

<code java>
class Parent {}
class Child extends Parent {}
</code>

and the methods:

<code java>
public Parent method() {}
public Child method() {}
</code>

as well as the invocation:

<code java>
Object o = new Parent ();
o = method();
</code>

The compiler cannot decide which implementation should be called in this particular case.

==== Genericity ====

While ad-hoc polymorphism (overriding) allows for **several different implementations** to be defined **under the same function name**, **genericity** in Java (and **parametric polymorphism** in general) allows for:
  * **a unique implementation to be defined over a range of types**

For instance, in:
<code java>
static <T> int count (List<T> l){
    int i = 0;
    for (T e:l)
        i++;
    return i;
}
</code>

the method ''count'' is defined for lists containing elements of any type ''T''. Technically, **genericity** in Java is a compile-time mechanism for ensuring //**cast-control**//: type parameters are checked by the compiler, but they are not kept in the run-time representation of objects. For instance, in the compilation phase, the above code is translated to:

<code java>
static int count (List l){
    int i = 0;
    for (Object e:l)
        i++;
    return i;
}
</code>

This process is called **type erasure**. Consider another example, before type erasure:
<code java>
List<Animal> l = ...
Animal e = l.get(i);
</code>

and after:
<code java>
List l = ...
Animal e = (Animal) l.get(i);
</code>

Thus, type-safety is achieved via automatic casts.

==== Others ====

There are other types of polymorphism which may appear in the literature, e.g. **subtype polymorphism** which simply means that **a variable ''v'' of type ''T'' is allowed to refer to an object of //any type derived from ''T''//**. Thus, subtype polymorphism is a basic OOP feature.

===== Polymorphism in Haskell =====

==== Parametric polymorphism ====

Parametric polymorphism is a fundamental trait of typed functional programming in general, and Haskell in particular. It manifests via the presence of **type variables** which stand for **any type**. Numerous functions defined so far are parametrically polymorphic:
  * ''foldl''
  * ''foldr''
  * ''map''
  * ''zipWith''

They define **unique** implementations which **are independent** of:
  * the type of the contained elements of a list
  * the function type which is applied on elements from a list
  * etc.
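
As a small sketch of such a type-independent definition, the function below mirrors the Java ''count'' method from the genericity section. The names ''count'', ''countInts'', ''countStrings'' and ''lengths'', as well as the sample lists, are chosen for illustration only.

<code haskell>
-- One implementation, usable for lists with any element type.
count :: [a] -> Int
count = foldr (\_ n -> n + 1) 0

-- The same code is applied to lists of different element types.
countInts :: Int
countInts = count [10, 20, 30 :: Integer]

countStrings :: Int
countStrings = count ["a", "b"]

-- map is itself polymorphic: here it applies count to each String
-- (a list of characters) and collects the results.
lengths :: [Int]
lengths = map count ["one", "three"]
</code>

Note that ''count'' never inspects the elements themselves, which is why a single definition suffices for every element type.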

Unlike Java, in Haskell, parametric polymorphism is an intrinsic (and key) feature of the type-system. To explore it in more depth, we start with a discussion of **polymorphic types**.

==== Polymorphic types ====

We illustrate Haskell polymorphic types by constructing **polymorphic lists** in essentially the same way the built-in list type is defined in Prelude:

<code haskell>
data List a = Void | Cons a (List a)
</code>

Compared to the **monomorphic lists** defined in the previous lecture, we observe:
  * the newly defined type is ''List a'', where ''a'' is a **type-variable**
  * ''Void :: (List a)'' which means that ''Void'' is a **polymorphic value** (i.e. ''Void'' can be the empty list for lists of integers, lists of strings etc.)
  * ''Cons :: a -> (List a) -> (List a)'', i.e. ''Cons'' takes a value of type ''a'', a list of type ''(List a)'' (not ''[a]'') and returns a list of type ''(List a)''

We also illustrate a recursive conversion function, as an example:

<code haskell>
listConvert :: (List a) -> [a]
listConvert Void = []
listConvert (Cons h t) = h:(listConvert t)
</code>
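
As a usage sketch, the fragment below repeats the ''List'' and ''listConvert'' definitions so that it is self-contained, and adds a few sample values. The inverse conversion ''listFromBuiltin'' and the names ''sample'', ''noInts'' and ''noStrings'' are hypothetical and serve only as illustration.

<code haskell>
data List a = Void | Cons a (List a)

listConvert :: List a -> [a]
listConvert Void = []
listConvert (Cons h t) = h : listConvert t

-- Hypothetical inverse conversion, from built-in lists to List.
listFromBuiltin :: [a] -> List a
listFromBuiltin = foldr Cons Void

-- A list of integers built directly from the data constructors.
sample :: List Integer
sample = Cons 1 (Cons 2 (Cons 3 Void))

-- Void is a polymorphic value: the same constructor is the empty
-- list of integers and the empty list of strings.
noInts :: List Integer
noInts = Void

noStrings :: List String
noStrings = Void
</code>

With these definitions, ''listConvert sample'' evaluates to ''[1,2,3]''.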

Pairs (and tuples in general) are a very useful data structure, and they can be defined as follows:

<code haskell>
data Pair a b = Pair a b
</code>

This definition requires more care when reading it:
  * ''data Pair a b'' defines a polymorphic type, where **two independent type variables** occur: the type of the first element of the pair, and that of the second. These two types need not coincide;
  * ''Pair :: a -> b -> Pair a b'' is the unique data constructor for pairs: it takes an element of type ''a'', one of type ''b'' and produces an element of type ''Pair a b''.

As before, we write an illustrative conversion function:
<code haskell>
pairconvert :: Pair a b -> (a,b)
pairconvert (Pair x y) = (x,y)
</code>

The programmer should not mistake the name ''Pair'' from the type ''Pair a b'' for the data constructor ''Pair :: a -> b -> Pair a b''. Similar to the language C, where two namespaces exist, one for structures and one for types (with the ''typedef'' instruction to create new types), here we also have two //namespaces//:
  * one for **types** (where ''Pair'' has been defined via the l.h.s. of the ''='' in the ''data'' definition)
  * one for **values (and functions)** (where the data constructor ''Pair'' has been defined)
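
The two namespaces can be observed in a short sketch. The function ''swapPair'' and the type ''Point'' (with its data constructor ''MkPoint'') are hypothetical names introduced only for this illustration.

<code haskell>
data Pair a b = Pair a b

pairconvert :: Pair a b -> (a, b)
pairconvert (Pair x y) = (x, y)

-- In the signature, Pair names the type.
-- In the pattern and in the body, Pair names the data constructor.
swapPair :: Pair a b -> Pair b a
swapPair (Pair x y) = Pair y x

-- Because the namespaces are separate, the two names may also differ:
-- Point lives among the types, MkPoint among the values.
data Point a = MkPoint a a

origin :: Point Integer
origin = MkPoint 0 0
</code>

Using the same name in both namespaces is a common convention when a type has a single data constructor, but nothing in the language requires it.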

We also define the polymorphic tree datatype:

<code haskell>
data Tree a = Leaf | Node (Tree a) a (Tree a)
</code>
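
A brief sketch of how such trees can be used is given below. The functions ''size'' and ''flatten'' and the value ''sampleTree'' are assumptions made for the example; they are not part of the lecture's definitions.

<code haskell>
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Number of values stored in the tree; independent of the element type.
size :: Tree a -> Int
size Leaf = 0
size (Node l _ r) = size l + 1 + size r

-- In-order traversal: left subtree, then the node value, then the right subtree.
flatten :: Tree a -> [a]
flatten Leaf = []
flatten (Node l x r) = flatten l ++ [x] ++ flatten r

-- A tree of characters with 'b' at the root.
sampleTree :: Tree Char
sampleTree = Node (Node Leaf 'a' Leaf) 'b' (Node Leaf 'c' Leaf)
</code>

Here ''flatten sampleTree'' evaluates to ''"abc"'' and ''size sampleTree'' to ''3''.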

=== Type constructors ===

Let us recall the **syntax** for types, as presented in the previous lecture. It mainly consisted of type-variables (anything), //function-types// as well as //list types//. To this list we may add any other type introduced via ''data''.

However, there is a more uniform and elegant way for describing these types. This approach relies on a **functional approach to type construction**:
  * We require special functions called **type constructors**, which take **types** as parameters and **return types**
  * To construct new types, we apply **type constructors** on type expressions (e.g. monomorphic types or type variables).

=== The List type constructor ===

In our previous ''data List a = ...'' definition:
  * ''List'' is a **type constructor**
  * Since, conceptually, ''List'' is also a function, it must have a **type**. The **type** of a **type constructor** is called a **kind** in Haskell. Thus, the kind of ''List'' is written as ''List :: * -> *'', which reads: //List receives a type and returns a type//
  * The polymorphic type ''List a'' is actually a **type function application**. The function is ''List'' and the parameter is the variable ''a''.
  * Similarly, the monomorphic type ''List Integer'' (or similarly ''[Integer]'') is constructed as an application of ''List'' on the monomorphic type ''Integer''.

**Exercise**: Describe the construction of the following types. What do they stand for?
  * ''[(List a)]''
  * ''(List [a])''

=== The Pair type constructor ===

  * ''Pair :: * -> * -> *'' is a **type constructor** with kind ''* -> * -> *''. It takes two types and produces a type.
  * The type ''Pair a Integer'' is polymorphic and represents the type of any pair whose second component is an integer.

With this observation, we can improve our syntax for types, as follows:

<code>
<type> ::= <const_type> | <type_var> | <type_constructor_application>
<type_constructor_application> ::= <type_const_1> <type> | <type_const_2> <type> <type> | ...
</code>

where:
  * ''<type_const_1>'' is any type constructor having kind ''* -> *''
  * ''<type_const_2>'' is any type constructor having kind ''* -> * -> *''
  * etc.

To conclude, we observe that **the function type** is also constructed via the application of the type constructor:
  * ''(->) :: * -> * -> *''
on specific types or type expressions.
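
The kinds discussed above can be written down explicitly. The following is a minimal sketch, assuming GHC with the ''KindSignatures'' extension; ''Type'' (from ''Data.Kind'') is GHC's name for the kind that the text writes as ''*''. The synonyms ''IntList'', ''Tagged'' and ''IntToString'', the wrapper ''Wrap'' and the values ''wrapped'' and ''render'' are hypothetical names used only for illustration.

<code haskell>
{-# LANGUAGE KindSignatures #-}
import Data.Kind (Type)

-- Each type parameter is annotated with its kind.
data List (a :: Type) = Void | Cons a (List a)
data Pair (a :: Type) (b :: Type) = Pair a b

-- New types are obtained by applying type constructors to types.
type IntList = List Integer             -- List applied to Integer
type Tagged a = Pair a Integer          -- Pair applied to a and Integer
type IntToString = (->) Integer String  -- the same type as Integer -> String

-- A type whose first parameter must itself be a type constructor
-- that receives a type and returns a type.
newtype Wrap (t :: Type -> Type) (a :: Type) = Wrap (t a)

-- List has the required kind, so Wrap List Integer is well-formed.
wrapped :: Wrap List Integer
wrapped = Wrap (Cons 1 Void)

-- A value of the function type built with the prefix form of (->).
render :: IntToString
render = show
</code>

A type such as ''Wrap Integer Integer'' would be rejected by the compiler, since ''Integer'' is not a type constructor that takes a parameter. This is the kind-level analogue of a type error.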


==== Ad-hoc polymorphism ====

Ad-hoc polymorphism is necessary in typed functional languages, and we illustrate it via a few examples:
  * to display an object, the interpreter calls the function ''show :: a -> String'', which takes a value of an arbitrary type and converts it to a ''String''. Naturally, ''show'' requires **type-dependent implementations**
  * similarly, the ''+'' operation has different implementations for Integers, Floats, and may be extended for other objects as well.
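
Both situations can be observed directly with the ''show'' and ''+'' functions from Prelude. The value names in the sketch below are chosen only for illustration.

<code haskell>
-- The same name show is used, but a different implementation is
-- selected for each argument type.
shownInt, shownBool, shownList :: String
shownInt  = show (42 :: Integer)     -- "42"
shownBool = show True                -- "True"
shownList = show [1, 2 :: Integer]   -- "[1,2]"

-- Likewise, + denotes integer addition in the first definition and
-- floating-point addition in the second.
sumInteger :: Integer
sumInteger = 1 + 2

sumDouble :: Double
sumDouble = 1.5 + 2.25
</code>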

=== Towards type-classes ===

Consider the types ''Nat'' and ''List a'' defined in previous lectures. To make values of type ''Nat'' or ''List a'' showable, we require functions of signature ''Nat -> String'' and ''List a -> String'', respectively. We define them below:

<code haskell>
showNat :: Nat -> String
showNat =
    let c Zero = 0
        c (Succ x) = 1 + (c x)
    in show . c

showList :: (List a) -> String
showList = lfoldr (\x y -> (show x)++":"++y) "[]"
</code>

The implementation of ''showList'' relies on ''lfoldr :: (a -> b -> b) -> b -> List a -> b''. Also, written in Haskell as-is, ''showList'' has problems regarding the call ''(show x)''. Consider that ''x::(List Nat)'' or that ''x :: Nat''. Depending on ''x'''s type, we need to call different show functions. What is obvious already is that **we need a single function name (e.g. show) which should have type-dependent implementations**.

An attempt to solve this issue is by introducing a new type:
<code haskell>
data Showable a = C1 Nat | C2 (List a)

show :: (Showable a) -> String
show (C1 x) = showNat x
show (C2 x) = showList x
</code> 

An object of type ''Showable a'' indicates a value which can be displayed. For each showable **type**, we define separate construction rules (''C1'' resp. ''C2''). The above code still has a problem:
  * ''showList :: (List a) -> String'', however, to be able to call ''(show x)'', x must be showable, hence ''x::Showable a''

To solve this issue, we modify the signature of ''showList'':
<code haskell>
showList :: (List (Showable a)) -> String
</code>

as well as the definition of ''Showable a'':

<code haskell>
data Showable a = C1 Nat | C2 (List (Showable a))
</code>

Our approach to handling **ad-hoc polymorphism** suffers from a single drawback:
  * it relies on **type-packing**. The programmer needs to handle both user-defined values (e.g. ''Zero :: Nat''), as well as showable values (e.g. ''C1 Zero :: Showable a''); **we have two, type-dependent representations for the same object**.
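
To make the drawback concrete, here is a self-contained sketch of the type-packing approach. It assumes that ''Nat'' has the constructors ''Zero'' and ''Succ'', and it gives one possible definition of ''lfoldr''. The display functions are renamed ''showPacked'' and ''showPackedList'' (hypothetical names) so that they do not clash with the ''show'' and ''showList'' functions exported by Prelude.

<code haskell>
data Nat = Zero | Succ Nat
data List a = Void | Cons a (List a)

-- One possible definition of the fold over List.
lfoldr :: (a -> b -> b) -> b -> List a -> b
lfoldr _ acc Void = acc
lfoldr f acc (Cons h t) = f h (lfoldr f acc t)

-- Every value that should be displayed must be wrapped in Showable.
data Showable a = C1 Nat | C2 (List (Showable a))

natToInteger :: Nat -> Integer
natToInteger Zero = 0
natToInteger (Succ n) = 1 + natToInteger n

showNat :: Nat -> String
showNat = show . natToInteger

showPacked :: Showable a -> String
showPacked (C1 n) = showNat n
showPacked (C2 l) = showPackedList l

showPackedList :: List (Showable a) -> String
showPackedList = lfoldr (\x y -> showPacked x ++ ":" ++ y) "[]"

-- The same natural number in its two representations.
two :: Nat
two = Succ (Succ Zero)

packedTwo :: Showable a
packedTwo = C1 two

-- A list of naturals must be packed element by element before display.
packedList :: Showable a
packedList = C2 (Cons (C1 Zero) (Cons (C1 two) Void))
</code>

Here ''showPacked packedList'' evaluates to ''"0:2:[]"'', but the list had to be rebuilt from packed elements: a plain value of type ''List Nat'' cannot be passed to ''showPacked''.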

=== Type-classes ===

To solve the above issue, ad-hoc polymorphism is implemented in Haskell via **type-classes**, which are conceptually different from classes in OOP. In short:
  * **a type-class** describes a collection of types, which can be defined by the user
  * **type-classes** also contain //function signatures// which are traits specific to each type in the type-class
  * **types can be enrolled in type-classes** by the user
  * **relationships between type-classes** (e.g. inclusion) can be defined by the user.

We illustrate all the above by introducing the definition for the type-class ''Show'':
<code haskell>
class Show a where
   show :: a -> String
</code>

In the above definition, ''a'' is an arbitrary type which is enrolled in class ''Show''. Any such type supports the function ''show'', which is defined as part of the type-class. The following code enrols our previous ''Nat'' type in class ''Show'', hence making naturals showable:

<code haskell>
instance Show Nat where
    show = 
      let convert Zero = 0
          convert (Succ n) = 1 + (convert n)
      in show . convert
</code>

In our implementation, the type of ''show . convert'' is ''Nat -> String'', since ''convert :: Nat -> Integer''. This example also shows ad-hoc polymorphism in action. In the above expression (the functional composition), the general type ''show :: (Show a) => a -> String'' of ''show'' becomes via unification ''::Integer -> String''. Thus, the compiler knows to call the integer implementation of ''show'', which is part of Prelude.

Also, note the interpretation of the type signature: ''show :: (Show a) => a -> String'' which tells us that:
  * ''show :: a -> String'' where ''a'' must be enrolled in the type-class ''Show''
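
The constraint propagates to every function that uses ''show'' on a value of unknown type. In the sketch below, ''describe'' and the three sample strings are hypothetical names; the instances used are those provided by Prelude.

<code haskell>
-- describe calls show on x, so its signature must state that a
-- is enrolled in Show. Without the (Show a) constraint, the
-- definition is rejected by the compiler.
describe :: Show a => a -> String
describe x = "value: " ++ show x

-- describe can be applied to values of any type enrolled in Show.
describedInt, describedBool, describedString :: String
describedInt    = describe (3 :: Integer)   -- "value: 3"
describedBool   = describe True             -- "value: True"
describedString = describe "hi"             -- "value: \"hi\""
</code>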

We continue by enrolling ''List a'' in class ''Show''. Recall that, to be able to show lists, the elements from the list need to be showable. Hence, the enrollment is:

<code haskell>
instance (Show a) => Show (List a) where
    show Void = "[]"
    show (Cons h t) = (show h)++":"++(show t)
</code>
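
Putting the two enrollments together gives the following self-contained sketch. It assumes that ''Nat'' has the constructors ''Zero'' and ''Succ''; the sample values ''nats'' and ''nested'' are hypothetical.

<code haskell>
data Nat = Zero | Succ Nat
data List a = Void | Cons a (List a)

instance Show Nat where
  -- The show on the right-hand side is the Integer implementation.
  show = show . convert
    where
      convert :: Nat -> Integer
      convert Zero = 0
      convert (Succ n) = 1 + convert n

instance Show a => Show (List a) where
  show Void = "[]"
  -- show h uses the instance of the element type,
  -- show t is a recursive call on the rest of the list.
  show (Cons h t) = show h ++ ":" ++ show t

nats :: List Nat
nats = Cons Zero (Cons (Succ Zero) Void)

-- Lists of lists are showable as well: the element type List Nat
-- is itself enrolled in Show.
nested :: List (List Nat)
nested = Cons nats (Cons Void Void)
</code>

With these instances, ''show nats'' evaluates to ''"0:1:[]"'' and ''show nested'' to ''"0:1:[]:[]:[]"''. No packing is needed: the values keep their original types, and the compiler selects the implementation of ''show'' from the type of the argument.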

Finally, we illustrate another kind of enrollment. It stems from the observation that both lists and trees support map operations, which have very similar behaviour:

<code haskell>
lmap :: (a -> b) -> (List a) -> (List b)
tmap :: (a -> b) -> (Tree a) -> (Tree b)
</code>

We can define the **class of mappable types**, which contains an ''fmap'' operation. In Haskell, this class is called ''Functor''. A tentative definition ''class Functor t where'' raises the question of what ''t'' should be, such that all constraints from the map signatures are preserved:
  * map transforms //containers of a kind// (e.g. Lists) into //containers// of the **same kind**
  * if the mapped transformation is ''(a -> b)'', then the first container must have elements of type ''a'' while the second must have elements of type ''b''.

The solution is:
<code haskell>
class Functor t where
   fmap :: (a -> b) -> t a -> t b
</code>
where ''t'' is a type-constructor with kind ''t :: * -> *''. Thus, we have the following enrollments:

<code haskell>
instance Functor List where
  fmap = ...

instance Functor Tree where
  fmap = ...
</code>
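
One possible way to complete these enrollments is sketched below. It is only an illustration: it uses the ''Functor'' class already provided by Prelude instead of redefining it, and the helpers ''incrementAll'', ''toBuiltin'' and ''flatten'' are hypothetical names.

<code haskell>
data List a = Void | Cons a (List a)
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Note that the instance is declared for List, not for List a:
-- the class parameter is a type constructor.
instance Functor List where
  fmap _ Void = Void
  fmap f (Cons h t) = Cons (f h) (fmap f t)

instance Functor Tree where
  fmap _ Leaf = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

-- Written once against the class, usable for both containers.
incrementAll :: Functor t => t Integer -> t Integer
incrementAll = fmap (+ 1)

-- Conversions to built-in lists, used to inspect the results.
toBuiltin :: List a -> [a]
toBuiltin Void = []
toBuiltin (Cons h t) = h : toBuiltin t

flatten :: Tree a -> [a]
flatten Leaf = []
flatten (Node l x r) = flatten l ++ [x] ++ flatten r
</code>

For instance, ''toBuiltin (incrementAll (Cons 1 (Cons 2 Void)))'' evaluates to ''[2,3]'', and ''flatten (incrementAll (Node Leaf 5 Leaf))'' to ''[6]''. In both cases the shape of the container is preserved and only the elements change.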