Abstract Datatypes (ADT)

In mathematics, an algebraic structure consists of a (carrier) set (say, the natural numbers) together with operations on the elements of the set which satisfy specific axioms (e.g. commutativity of addition).

Much like an algebraic structure, an abstract data type consists of:

  • a specification of a class of objects,
  • a set of operations which operate on such objects,
  • the behaviours of such operations.

In what follows, we define the ADT for lists and use it to exemplify the main ingredients of an ADT construction. The intuition (and subtlety) underlying ADTs is related to the duality between abstract and concrete representation.

This duality can also be found in logic. Consider the following formula:

$ \lnot p \wedge q $

The formula combines the (abstract) propositions into a logical (abstract) statement. We can know the truth value of this statement only under a (concrete) interpretation of the propositions $ p$ and $ q$.

However, what makes the formula abstract is that its structure does not depend on the meaning assigned to $ p$ and $ q$: the same formula can be read under many interpretations. Thus:

  • $ p$ may refer to "it rains" and $ q$ to "the ground is wet", or
  • $ p$ may refer to "the monitor is unplugged" and $ q$ to "the monitor is off".

Following this example, we find that:

  • an abstract datatype is similar to a formula: it specifies a behaviour;
  • a type implementation is similar to an interpretation of a formula.

We will make this intuition more precise in what follows.

Sorts

A sort is a symbol which is interpreted as a set. As illustrated above, sorts are abstract, while their interpretations (i.e. sets) are concrete.

We will produce a definition of the ADT list which is parametric with respect to the elements contained by the list. Hence, we require two sorts:

  • List - whose interpretation is the set of all possible lists
  • E - whose interpretation is the set of all elements contained in a list

Base constructors

The base constructors for our ADT are:

$ Void : List$

$ Cons : E \times List \rightarrow List$

$ Void$ is the abstract representation of the empty list, while $ Cons$ is an operator which takes an element and a list and constructs a new list. This pair of constructors has the following properties:

  • each possible list can be abstractly represented as a sequence of base constructor applications. For instance, the list [1,2,3] can indeed be represented as: $ Cons(1,Cons(2,Cons(3,Void)))$ .
  • each list representation uniquely refers to a specific list. For instance $ Cons(2,Cons(2,Void))$ can only be interpreted as the list [2,2]

In general, base constructors have further interesting properties, whose study is outside the scope of this lecture. Deciding whether a given set of constructors are base constructors is non-trivial in the general case; for our examples, however, it is straightforward.

ADT operators

The operations which we perform on lists are:

$ isEmpty : List \rightarrow \mathbb{B}$

$ size : List \rightarrow \mathbb{N}$

$ head : List \rightarrow E$

$ tail : List \rightarrow List$

$ append : List \times List \rightarrow List$

$ reverse : List \rightarrow List$

Their definitions are self-explanatory. We simply observe that we introduce and utilise the sorts $ \mathbb{B}$ and $ \mathbb{N}$, which are interpreted as the sets of booleans and natural numbers respectively, with their corresponding standard operations.

Axioms

The operations defined above need additional specification to ensure that they behave in the desired way. Such a specification should express list manipulations and transformations in terms of the base constructors. For instance:

  • (H) $ head(Cons(e,l)) = e$
  • (T) $ tail(Cons(e,l)) = l$

(H) and (T) are axioms, and describe the intended behaviour of the operations. Recall our two implementations LinkedList and ArrayList together with the functions Cons, head, tail. Note that the functions head and tail for both list implementations satisfy the above axioms.

We continue with:

  • (E1) $ isEmpty(Void) = true$
  • (E2) $ isEmpty(Cons(e,l)) = false$
  • (S1) $ size(Void) = 0$
  • (S2) $ size(Cons(e,l)) = 1 + size(l)$
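The axioms so far can be checked against a concrete representation. The following is a minimal sketch in C, assuming a singly linked representation with the sort E fixed to int. The type names Node and List, and the choice to abort on head or tail of the empty list (a case the axioms leave unspecified), are assumptions of this sketch and not necessarily the LinkedList implementation recalled above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical linked representation; E is fixed to int for this sketch. */
typedef struct Node {
    int elem;
    struct Node *next;
} Node;
typedef Node *List;

/* Base constructor Void: the empty list is the null pointer. */
List Void(void) { return NULL; }

/* Base constructor Cons: a new cell holding e, placed in front of l. */
List Cons(int e, List l) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->elem = e;
    n->next = l;
    return n;
}

/* (E1), (E2): only Void is empty. */
bool isEmpty(List l) { return l == NULL; }

/* (S1), (S2): Void has size 0, Cons(e,l) has size 1 + size(l). */
size_t size(List l) { return isEmpty(l) ? 0 : 1 + size(l->next); }

/* (H): the axioms say nothing about head(Void); this sketch aborts. */
int head(List l) { assert(!isEmpty(l)); return l->elem; }

/* (T): likewise undefined on Void. */
List tail(List l) { assert(!isEmpty(l)); return l->next; }
```

For instance, with l built as Cons(2, Cons(3, Void())), the value head(Cons(1, l)) is 1 and tail(Cons(1, l)) is l itself, as (H) and (T) require.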

Next, we move on to specify concatenation. We find that several axioms make sense:

  • (1) $ append(l1,Void) = l1$
  • (2) $ append(Void,l2) = l2$
  • (3) $ append(l1,append(l2,l3)) = append(append(l1,l2),l3)$
  • (4) $ size(append(l1,l2)) = size(l1) + size(l2)$
  • (5) $ append(Cons(e,l1),l2) = Cons(e,append(l1,l2))$

While each of the above axioms describes sensible concatenation behaviour, the interesting questions are:

  • are the axioms sufficient in order to describe correct concatenation behaviour?
  • are all axioms necessary?

The first question is of a rather philosophical nature and depends on the intention of the specifier (programmer). We shall not comment on it; however, it is possible (using instruments from Category Theory) to study the question more formally and in more detail.

The answer to the second question is: no. As it turns out, it is sufficient to choose as axioms:

  • (A1) $ append(Void,l2) = l2$
  • (A2) $ append(Cons(e,l1),l2) = Cons(e,append(l1,l2))$

We shall see in the next lecture that, by taking these axioms for concatenation, we get (1),(3),(4) for free - they are direct consequences of the axioms.

There are several hints as to why (A1) and (A2) are good axiom choices:

  • they specify the behaviour of $ append$ only in terms of base constructors
  • they are de-constructive, in the sense that they specify the concatenation of a larger list in terms of the concatenation of a smaller list
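
As a small worked example (taking E to be the natural numbers, an assumption made only for illustration), (A1) and (A2) alone suffice to compute a concrete concatenation, by repeatedly peeling the first element off the left list:

  • $ append(Cons(1,Cons(2,Void)),Cons(3,Void))$
  • $ = Cons(1,append(Cons(2,Void),Cons(3,Void)))$ by (A2)
  • $ = Cons(1,Cons(2,append(Void,Cons(3,Void))))$ by (A2)
  • $ = Cons(1,Cons(2,Cons(3,Void)))$ by (A1)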

Which of the following are appropriate axioms for reversal?

  • (1) $ reverse(Void) = Void$
  • (2) $ reverse(reverse(l)) = l$
  • (3) $ reverse(append(l1,l2)) = append(reverse(l2),reverse(l1))$
  • (4) $ reverse(Cons(e,l)) = append(reverse(l),Cons(e,Void))$
  • (5) $ size(l) = size(reverse(l))$
  • (6) $ reverse(l) = append(reverse(tail(l)),head(l))$

Conclusion: The ADT List

An abstract datatype consists of:

  • sorts for all utilised elements (sorts behave similarly to types in programming languages)
  • base constructors
  • operators
  • operator axioms

A concrete datatype which implements an ADT must:

  • implement base constructors
  • implement operators
  • ensure that all operator axioms are satisfied by the implementation.

It is sometimes the case that axioms can also serve the role of implementations by themselves. Recall our LinkedList implementation of the ADT list. We have (at least) two possible ways to implement concatenation:

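List append(List l1, List l2) {
   /* appending to the empty list yields l2 unchanged */
   if (isEmpty(l1))
      return l2;
   List l = l1;   /* remember the first cell of l1 */
   /* advance l1 to its last cell */
   while (!isEmpty(l1->next)){
      l1 = l1->next;
   }
   l1->next = l2; /* link the last cell of l1 to l2, in place */
   return l;
}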

as well as:

List append(List l1, List l2) {
   if (isEmpty(l1))
      return l2;                                /* axiom (A1) */
   return cons(head(l1),append(tail(l1),l2));   /* axiom (A2) */
}

  • The first concatenation relies on implementation details to ensure efficiency. We can easily show via an inductive argument that the implementation satisfies the axioms.
  • The second concatenation relies on the axioms themselves, which are transcribed in C. It is less efficient (it recurses once per element of l1 and rebuilds l1 cell by cell), however it is an implementation which is independent of the list implementation.

The observation here is that Abstract Datatypes can serve as a programming model just like, for instance, Object-Oriented Programming.

The data type FIFO (First-In First-Out) is given by the following base constructors:

  • $ Empty: FIFO$
  • $ Enqueue : E \times FIFO \rightarrow FIFO$

and operators:

  • $ Dequeue : FIFO \rightarrow FIFO$
  • $ Top : FIFO \rightarrow E$

We omit other basic operators, which are similar to those for lists, and focus our attention on defining axioms for $ Dequeue$ and $ Top$. The first inserted element must be removed:

  • (D1) $ Dequeue(Enqueue(e,Empty)) = Empty$
  • (T1) $ Top(Enqueue(e,Empty)) = e$

Attempting a dequeue operation on a FIFO of size larger than 1 will not affect the element which was introduced last:

  • (D2) $ Dequeue(Enqueue(e,Enqueue(e',l))) = Enqueue(e,Dequeue(Enqueue(e',l)))$
  • (T2) $ Top(Enqueue(e,Enqueue(e',l))) = Top (Enqueue(e',l))$
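
To see the axioms at work, consider (as an illustration, with E taken to be the natural numbers) a FIFO into which 1, 2 and 3 were inserted in this order. Applying (D2) twice and then (D1) removes the oldest element and leaves the others in place:

  • $ Dequeue(Enqueue(3,Enqueue(2,Enqueue(1,Empty))))$
  • $ = Enqueue(3,Dequeue(Enqueue(2,Enqueue(1,Empty))))$ by (D2)
  • $ = Enqueue(3,Enqueue(2,Dequeue(Enqueue(1,Empty))))$ by (D2)
  • $ = Enqueue(3,Enqueue(2,Empty))$ by (D1)

Similarly, two applications of (T2) followed by (T1) give $ Top(Enqueue(3,Enqueue(2,Enqueue(1,Empty)))) = 1$.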

There are multiple possible FIFO implementations, but here we focus on one which is itself abstract. It relies on seeing a FIFO as two lists denoted $ \langle l,r\rangle$ :

  • inserting a new element in the FIFO amounts to adding an element into $ l$
  • removing an element from the FIFO amounts to removing an element from $ r$

For brevity, we shall use the following notational conventions:

  • $ Cons(e,l) \equiv e:l$
  • $ append(l1,l2) \equiv l1 ++ l2$

Let us consider the FIFO represented by: $ \langle 1:2:Void, 9:8:Void\rangle$. An $ Enqueue$ of $ 0$ will produce the FIFO $ \langle 0:1:2:Void, 9:8:Void\rangle$, while a $ Dequeue$ will produce $ \langle 1:2:Void, 8:Void\rangle$. The most recently inserted element sits at the head of $ l$ and the oldest one at the head of $ r$. Hence the elements of the original FIFO, in the order in which they were introduced, are: $ 9,8,2,1$.

The FIFO implementation has two issues which require attention:

  • A pair of lists is not a unique FIFO representation. For instance, $ Enqueue(1,Enqueue(2,Empty))$ can be interpreted by $ \langle 1:2:Void, Void \rangle$ as well as by $ \langle 1:Void, 2:Void \rangle$. Thus, FIFO equality in the implementation requires the operation: $ \langle l, r \rangle \equiv \langle l', r' \rangle \stackrel{def}{\iff} l ++ reverse(r) = l' ++ reverse(r')$
  • A FIFO such as $ \langle 1:2:Void, Void \rangle$ is non-empty, however we cannot extract elements from the right list, since it is empty. Hence, we require the operation:
    • (N) $ normalize(\langle l, Void \rangle) = \langle Void, reverse(l) \rangle$

The normalization procedure ensures that, if the FIFO is non-empty, there are still elements to be dequeued from the right list. Note that the axiom is specified only for FIFOs whose right list is empty, which is a matter of convenience for our implementation proofs.

Implementations

In the implementation, we will assume the following invariant: all operations receive normalised FIFOs and return normalised FIFOs.

  • $ Enqueue(e,\langle Void, Void\rangle) = \langle Void, e:Void \rangle$
  • $ Enqueue(e,\langle l, e':r\rangle) = \langle e:l, e':r\rangle$
  • $ Dequeue(\langle l, e:Void\rangle) = normalize(\langle l, Void \rangle)$
  • $ Dequeue(\langle l, e:e':r\rangle) = \langle l, e':r\rangle$
  • $ Top(\langle l, e:r\rangle) = e$
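
The equations above translate almost line by line into code. The following is a minimal sketch in C, assuming E is fixed to int and lists are singly linked. The names Fifo, fifoIsEmpty, cons and the representation of $ Empty$ as a pair of null pointers are assumptions of this sketch, and nodes are never freed, to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal list of ints (hypothetical; E is fixed to int). */
typedef struct Node { int elem; struct Node *next; } Node;
typedef Node *List;

List cons(int e, List l) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->elem = e;
    n->next = l;
    return n;
}

/* Builds a fresh reversed copy; the argument is left untouched. */
List reverse(List l) {
    List acc = NULL;
    for (; l != NULL; l = l->next)
        acc = cons(l->elem, acc);
    return acc;
}

/* A FIFO <l, r>: newest element at the head of l, oldest at the head of r. */
typedef struct { List l; List r; } Fifo;

/* Assumed interpretation of Empty: <Void, Void>. */
Fifo Empty(void) { Fifo q = { NULL, NULL }; return q; }

bool fifoIsEmpty(Fifo q) { return q.l == NULL && q.r == NULL; }

/* (N): <l, Void> becomes <Void, reverse(l)>; other FIFOs are unchanged. */
Fifo normalize(Fifo q) {
    if (q.r == NULL) {
        q.r = reverse(q.l);
        q.l = NULL;
    }
    return q;
}

Fifo Enqueue(int e, Fifo q) {
    if (q.r == NULL) {
        /* By the invariant, an empty right list means q is <Void, Void>. */
        assert(q.l == NULL);
        q.r = cons(e, NULL);
    } else {
        q.l = cons(e, q.l);
    }
    return q;
}

/* Drops the head of r, then restores the invariant if r became empty. */
Fifo Dequeue(Fifo q) {
    assert(q.r != NULL);
    q.r = q.r->next;
    return normalize(q);
}

int Top(Fifo q) {
    assert(q.r != NULL);
    return q.r->elem;
}
```

Since Dequeue calls normalize only after dropping the head of the right list, both Dequeue equations are covered by a single code path: normalize leaves the FIFO unchanged whenever the right list is still non-empty.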

Axioms are preserved

We proceed to verify that the axioms for $ Dequeue$ are preserved by the implementation.
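
A sketch of the simplest cases follows. It assumes that $ Empty$ is interpreted as $ \langle Void, Void\rangle$ (suggested by the first $ Enqueue$ equation, but not stated explicitly) and that $ reverse(Void) = Void$ and $ reverse(e:Void) = e:Void$ hold for lists. For (D1):

  • $ Dequeue(Enqueue(e,\langle Void, Void\rangle))$
  • $ = Dequeue(\langle Void, e:Void\rangle)$ by the first equation for $ Enqueue$
  • $ = normalize(\langle Void, Void\rangle)$ by the first equation for $ Dequeue$
  • $ = \langle Void, reverse(Void)\rangle$ by (N)
  • $ = \langle Void, Void\rangle$

which is the interpretation of $ Empty$. For the instance of (D2) in which $ l$ is $ Empty$, the left-hand side evaluates as:

  • $ Dequeue(Enqueue(e,Enqueue(e',\langle Void, Void\rangle)))$
  • $ = Dequeue(\langle e:Void, e':Void\rangle)$ by the two equations for $ Enqueue$
  • $ = normalize(\langle e:Void, Void\rangle)$ by the first equation for $ Dequeue$
  • $ = \langle Void, e:Void\rangle$ by (N)

while the right-hand side, using the computation for (D1), is $ Enqueue(e,\langle Void, Void\rangle) = \langle Void, e:Void\rangle$. The general case of (D2) is not covered by this sketch.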

ADTs offer an interface between object implementation and object behaviour. The FIFO example illustrates the power of ADTs:

  • A FIFO can be implemented as two lists, irrespective of the list implementation choice
  • Once the axioms are shown to hold in an implementation, that implementation is guaranteed to satisfy every property which follows from the axioms of the ADT.