Abstract Datatypes
Intro
An Abstract Datatype relies on functions to describe the possible values of a type. We start with a simple example:
data Nat = Zero | Succ Nat
- The expression data Nat introduces a new type in the programming language.
- After =, the base constructors of the type follow. In Haskell, all constructors must begin with a capital letter.
- Zero is a nullary constructor.
- Succ Nat designates an internal constructor, which expects a natural number (of type Nat). A value (Succ x) is of type Nat (i.e. a natural number), and we may be tempted to see it as a function call, which returns the successor of x.
- Zero and Succ are called data constructors in Haskell. Nat is called a type or type-constructor. We shall distinguish between the two in a later lecture.
Note that the internal representation of an ADT, as perceived by the programmer, is abstract. We may see values as calls of special functions - data constructors. Except for their meaning (and language-level implementation), data constructors behave exactly as functions. For instance:
Zero :: Nat
Succ :: Nat → Nat
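Because data constructors behave as functions, they can be passed to higher-order functions or composed like any other function. The following is a minimal sketch of this idea; the names successors and two, as well as the deriving clause, are additions made for illustration only.

```haskell
-- Same type as above; Show and Eq are derived only so that values
-- can be printed and compared (an addition for this example).
data Nat = Zero | Succ Nat
  deriving (Show, Eq)

-- Succ is passed to map exactly like an ordinary function of type Nat -> Nat.
successors :: [Nat]
successors = map Succ [Zero, Succ Zero]

-- Constructors compose like functions: (Succ . Succ) :: Nat -> Nat.
two :: Nat
two = (Succ . Succ) Zero
```

Here successors evaluates to [Succ Zero, Succ (Succ Zero)] and two evaluates to Succ (Succ Zero).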
We continue the example with addition:
add :: Nat -> Nat -> Nat
add Zero y = y
add (Succ x) y = Succ (add x y)
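To try add out, it helps to convert between Nat and ordinary integers. The sketch below assumes two helper functions, fromInt and toInt, which are hypothetical names introduced here and are not part of the original code.

```haskell
data Nat = Zero | Succ Nat
  deriving (Show, Eq)

add :: Nat -> Nat -> Nat
add Zero y = y
add (Succ x) y = Succ (add x y)

-- Hypothetical helper: builds the Nat for a non-negative Integer.
-- Negative inputs are mapped to Zero (a choice made for this sketch).
fromInt :: Integer -> Nat
fromInt n
  | n <= 0    = Zero
  | otherwise = Succ (fromInt (n - 1))

-- Hypothetical helper: counts the Succ constructors in a value.
toInt :: Nat -> Integer
toInt Zero     = 0
toInt (Succ x) = 1 + toInt x

-- add (Succ (Succ Zero)) (Succ Zero) unfolds as
--   Succ (add (Succ Zero) (Succ Zero))
--   Succ (Succ (add Zero (Succ Zero)))
--   Succ (Succ (Succ Zero))
three :: Nat
three = add (fromInt 2) (fromInt 1)
```

Note that add recurses only on its first argument, so the number of steps depends on the size of the first operand.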
An important observation is that the pattern matching mechanism in Haskell relies on data constructors, and their applications. For instance, the following definition is a correct usage of the pattern matching mechanism:
f (1:y:[]) = ...
It uses the data constructors (:) and [] for lists, as well as the literal 1, which acts as a data constructor for integers. The pattern describes any list of integers which starts with a 1 and contains exactly two elements.
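The body of f is elided above. As an illustration only, the sketch below assumes a hypothetical body that returns the second element, together with a fallback clause for all other lists.

```haskell
-- Hypothetical completion of f: returns the second element when the
-- list has exactly two elements and starts with 1.
f :: [Integer] -> Maybe Integer
f (1:y:[]) = Just y
f _        = Nothing

-- The same pattern written with list syntax; [1, y] is shorthand for 1:y:[].
g :: [Integer] -> Maybe Integer
g [1, y] = Just y
g _      = Nothing
```

With these definitions, f [1,5] yields Just 5, while f [1,5,6] and f [2,5] both yield Nothing, since neither list matches the pattern.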
Monomorphic List implementation
In what follows, we give an implementation for the type List of integers. This type is called monomorphic, since our list can only contain elements of a single type (integer):
data IList = Void | Cons Integer IList

app :: IList -> IList -> IList
app Void l = l
app (Cons h t) l = Cons h (app t l)

convert :: IList -> [Integer]
convert Void = []
convert (Cons h t) = h : (convert t)

mfoldl :: (b -> Integer -> b) -> b -> IList -> b
mfoldl op acc Void = acc
mfoldl op acc (Cons h t) = mfoldl op (op acc h) t

mfoldr :: (Integer -> a -> a) -> a -> IList -> a
mfoldr op acc Void = acc
mfoldr op acc (Cons h t) = op h (mfoldr op acc t)

convert2 :: IList -> [Integer]
convert2 = mfoldr (:) []

showl :: IList -> String
showl = show . convert2
In the above code, we have defined some basic list operations, such as app (list concatenation) and convert, which transforms a list of type IList to a conventional Haskell list (of type [Integer]).
We have also implemented the two folding procedures for IList. Note the type signature of each fold. Finally, we have used folds to provide an alternative implementation for convert (convert2), as well as for converting lists to strings (showl).
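The following sketch shows how these operations might be used together. The sample lists xs and ys and the helpers lsum and lreverse are hypothetical names added for illustration; the remaining definitions repeat the ones above so that the block is self-contained.

```haskell
data IList = Void | Cons Integer IList

app :: IList -> IList -> IList
app Void l = l
app (Cons h t) l = Cons h (app t l)

mfoldl :: (b -> Integer -> b) -> b -> IList -> b
mfoldl op acc Void = acc
mfoldl op acc (Cons h t) = mfoldl op (op acc h) t

mfoldr :: (Integer -> a -> a) -> a -> IList -> a
mfoldr op acc Void = acc
mfoldr op acc (Cons h t) = op h (mfoldr op acc t)

showl :: IList -> String
showl = show . mfoldr (:) []

-- Sample lists (hypothetical names).
xs, ys :: IList
xs = Cons 1 (Cons 2 Void)
ys = Cons 3 Void

-- Hypothetical helper: sums the elements with the left fold.
lsum :: IList -> Integer
lsum = mfoldl (+) 0

-- Hypothetical helper: folding from the left while consing each element
-- onto the accumulator reverses the list.
lreverse :: IList -> IList
lreverse = mfoldl (\acc h -> Cons h acc) Void
```

Here showl (app xs ys) gives "[1,2,3]", lsum (app xs ys) gives 6, and showl (lreverse (app xs ys)) gives "[3,2,1]".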
Propositional Logic in Haskell
Abstract Datatypes are a natural way to define more elaborate data-structures, such as propositional formulae. We give a possible definition below:
data Formula = Var String | And Formula Formula | Or Formula Formula | Not Formula
We observe that:
Var :: String → Formula
And :: Formula → Formula → Formula
Or :: Formula → Formula → Formula
Not :: Formula → Formula
A good exercise consists in the implementation of a display function for formulae:
fshow :: Formula -> String
fshow (Var v) = v
fshow (And f1 f2) = "(" ++ (fshow f1) ++ " ^ " ++ (fshow f2) ++ ")"
fshow (Or f1 f2) = "(" ++ (fshow f1) ++ " V " ++ (fshow f2) ++ ")"
fshow (Not f) = "~(" ++ (fshow f) ++ ")"
Next, we implement a function push, which pushes negation inward:
push :: Formula -> Formula
push (Var v) = (Var v)
push (Not (Var v)) = Not (Var v)
push (Not (And f1 f2)) = Or (push (Not f1)) (push (Not f2))
push (Not (Or f1 f2)) = And (push (Not f1)) (push (Not f2))
push (Not (Not f)) = push f  -- double negation, needed so that push covers every formula
push (And f1 f2) = And (push f1) (push f2)
push (Or f1 f2) = Or (push f1) (push f2)
Notice the implementation of De Morgan's laws at lines 4 and 5.
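As a quick check, the sketch below applies push to a sample formula and displays the result with fshow. The name sample is hypothetical, and this version of push includes a case for double negation so that it is defined on every formula.

```haskell
data Formula = Var String | And Formula Formula | Or Formula Formula | Not Formula

fshow :: Formula -> String
fshow (Var v) = v
fshow (And f1 f2) = "(" ++ fshow f1 ++ " ^ " ++ fshow f2 ++ ")"
fshow (Or f1 f2) = "(" ++ fshow f1 ++ " V " ++ fshow f2 ++ ")"
fshow (Not f) = "~(" ++ fshow f ++ ")"

push :: Formula -> Formula
push (Var v) = Var v
push (Not (Var v)) = Not (Var v)
push (Not (And f1 f2)) = Or (push (Not f1)) (push (Not f2))
push (Not (Or f1 f2)) = And (push (Not f1)) (push (Not f2))
push (Not (Not f)) = push f  -- double negation
push (And f1 f2) = And (push f1) (push f2)
push (Or f1 f2) = Or (push f1) (push f2)

-- Hypothetical sample formula: ~(x ^ (y V ~z))
sample :: Formula
sample = Not (And (Var "x") (Or (Var "y") (Not (Var "z"))))
```

With these definitions, fshow sample gives "~((x ^ (y V ~(z))))" and fshow (push sample) gives "(~(x) V (~(y) ^ z))": after pushing, negation appears only directly in front of variables.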
Finally, we implement a function which computes the truth-value of a formula, under a certain interpretation. First, we define the type Interpretation:
type Interpretation = String -> Bool

i :: Interpretation
i "x" = True
i "y" = False
i "z" = True
In the first line, we have defined a type-alias: Interpretation is a function from strings to booleans. We have also implemented a three-variable interpretation, for testing purposes. Next, we implement eval:
eval :: Interpretation -> Formula -> Bool
eval i (Var v) = i v
eval i (Not f) = not (eval i f)
eval i (And f1 f2) = (eval i f1) && (eval i f2)
eval i (Or f1 f2) = (eval i f1) || (eval i f2)
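The sketch below shows eval at work. Two things are assumptions made for this example: the catch-all clause of i, which treats unlisted variables as False, and the helper fromTrue, a hypothetical function which builds an interpretation from the list of variables considered true.

```haskell
data Formula = Var String | And Formula Formula | Or Formula Formula | Not Formula

type Interpretation = String -> Bool

i :: Interpretation
i "x" = True
i "y" = False
i "z" = True
i _   = False  -- default added for this sketch, so that i is total

eval :: Interpretation -> Formula -> Bool
eval i (Var v) = i v
eval i (Not f) = not (eval i f)
eval i (And f1 f2) = eval i f1 && eval i f2
eval i (Or f1 f2) = eval i f1 || eval i f2

-- Hypothetical helper: a variable is true exactly when it occurs in the list.
fromTrue :: [String] -> Interpretation
fromTrue vs v = v `elem` vs

-- Sample formulae (hypothetical names).
f1, f2 :: Formula
f1 = And (Var "x") (Not (Var "y"))   -- x ^ ~y
f2 = Or (Var "y") (Not (Var "z"))    -- y V ~z
```

Under i, f1 evaluates to True and f2 to False. Since an interpretation is just a function, a different one can be supplied directly: eval (fromTrue ["y"]) f2 evaluates to True.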
Monomorphic Trees in Haskell
data ITree = Leaf | Node ITree Integer ITree
Note that:
Leaf :: ITree
Node :: ITree → Integer → ITree → ITree
Next, we implement a folding operation on Trees. The key to it is to conceptually define what folding should do on Trees. To grasp an intuition, recall that:
- foldr (:) [] is the identity function on lists. Hence, foldr (:) [] [1,2,3] produces [1,2,3].
- also, recall that the map operation can be defined as: \f → foldr ((:).f) []
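Both observations about foldr can be checked directly. In the sketch below, listId and mapViaFoldr are hypothetical names chosen for this illustration.

```haskell
-- foldr (:) [] rebuilds the list using the same constructors,
-- so it returns a list equal to its input.
listId :: [a] -> [a]
listId = foldr (:) []

-- map expressed through foldr: f is applied to each element
-- before it is consed onto the result.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr ((:) . f) []
```

For instance, listId [1,2,3] produces [1,2,3], and mapViaFoldr (*2) [1,2,3] produces [2,4,6].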
Similar to the list case, a fold on trees should:
- preserve the tree structure, given the appropriate operator (e.g. Node).
- hence, the call mtfold Node Leaf (where Leaf is the accumulator) should be the identity function on trees.
Let us define the identity function tid for trees:
tid Leaf = Leaf
tid (Node l k r) = Node (tid l) k (tid r)
As already said, (tid t) is equivalent to the call (mtfold Node Leaf t), for any tree t.
To obtain the general mtfold implementation, we simply generalise Node to an arbitrary operation op:
- if Node :: ITree → Integer → ITree → ITree, then op :: b → Integer → b → b
We also generalise Leaf to an arbitrary accumulator:
- if Leaf :: ITree then acc :: b
The code generalisation becomes:
mtfold :: (b -> Integer -> b -> b) -> b -> ITree -> b
mtfold op acc =
  let f (Node left key right) = op (f left) key (f right)
      f Leaf = acc
  in f
Notice that f is precisely the generalisation of tid according to our observations. We can use mtfold to implement various tree operations such as:
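-- In each lambda, l is the result for the left subtree, k is the key,
-- and r is the result for the right subtree.
tsum :: ITree -> Integer
tsum = mtfold (\l k r -> l + k + r) 0

tmirror :: ITree -> ITree
tmirror = mtfold (\l k r -> Node r k l) Leaf

tflatten :: ITree -> [Integer]
tflatten = mtfold (\l k r -> l ++ [k] ++ r) []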
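The sketch below puts the tree fold to work on a small example. It assumes that op is applied at every node to the folded left subtree, the key, and the folded right subtree, in that order. The names sample and theight are hypothetical, and the deriving clause is added only so that trees can be compared and printed.

```haskell
data ITree = Leaf | Node ITree Integer ITree
  deriving (Show, Eq)

mtfold :: (b -> Integer -> b -> b) -> b -> ITree -> b
mtfold op acc = f
  where
    f Leaf = acc
    f (Node left key right) = op (f left) key (f right)

tsum :: ITree -> Integer
tsum = mtfold (\l k r -> l + k + r) 0

tmirror :: ITree -> ITree
tmirror = mtfold (\l k r -> Node r k l) Leaf

tflatten :: ITree -> [Integer]
tflatten = mtfold (\l k r -> l ++ [k] ++ r) []

-- Hypothetical helper: the height of a tree, with Leaf at height 0.
theight :: ITree -> Integer
theight = mtfold (\l _ r -> 1 + max l r) 0

-- Hypothetical sample tree:
--       2
--      / \
--     1   3
sample :: ITree
sample = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)
```

On this tree, mtfold Node Leaf sample returns a tree equal to sample, tsum sample gives 6, tflatten sample gives [1,2,3], tflatten (tmirror sample) gives [3,2,1], and theight sample gives 2.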