Recall the FIFO implementation from the previous lecture, which relies on two lists. This implementation has several advantages:
Let us return to the first point above, namely that the cost of removing an element is not constant in the general case: if the right list contains only one element, removal triggers the copying of all elements from the left list to the right list, for a total cost of $ \Theta(n)$ , where $ n$ is the size of the FIFO.
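As a reminder, such a two-list FIFO can be sketched as follows (a minimal Python sketch; the class and method names are ours, not from the lecture):

```python
class Fifo:
    """FIFO built from two lists: `left` receives enqueued elements,
    `right` holds elements in dequeue order; when `right` runs out,
    `left` is emptied into it in reverse."""

    def __init__(self):
        self.left = []
        self.right = []

    def enqueue(self, x):
        self.left.append(x)              # one insertion, cost 1

    def _refill(self):
        # the expensive step: Theta(n) moves when `right` is empty
        while self.left:
            self.right.append(self.left.pop())

    def dequeue(self):
        if not self.right:
            self._refill()
        return self.right.pop()          # one removal, cost 1

    def top(self):
        if not self.right:
            self._refill()
        return self.right[-1]
```

Enqueues always cost 1; a dequeue costs 1, except when `right` is empty, in which case all elements are first moved across.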
A worst-case analysis yields a removal cost of $ O(n)$ . In this lecture, we show that this analysis is not precise, and not fair to the FIFO implementation. Consider a sequence

$ S = op_1, \ldots, op_n $

of $ n$ operations, where each $ op_i$ is one of $ enqueue$ , $ dequeue$ or $ top$ . In what follows, we study three methods (the aggregate, the banking and the potential method) for determining the amortised cost per operation.
We observe that:
$ cost(S)= cost(ins_l) + cost(del_l) + cost(ins_r) + cost(del_r)$
where $ ins_x$ and $ del_x$ denote the total costs of all insertions (resp. removals) performed on list $ x$ . We assume each individual insertion into, or deletion from, a list has cost 1.
Hence:
$ cost(S) \leq n + n + n + n = 4n$
The average cost per operation is $ \frac{cost(S)}{n}$ , hence at most $ 4$ . This analysis shows that we may safely charge each $ enqueue$ or $ dequeue$ operation an individual cost of $ 4$ : on average, this is an upper bound on the real cost.
Therefore, in any algorithm employing our FIFO, the average cost per FIFO operation is constant.
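The aggregate bound can be checked experimentally. The sketch below (our own helper, with dequeues on an empty FIFO ignored) counts the individual list operations performed over a random sequence and compares them against the $ 4n$ bound:

```python
import random

def simulate(ops):
    """Count the individual list insertions and removals (cost 1 each)
    performed by the two-list FIFO on a sequence of operations.
    Dequeues on an empty FIFO are ignored."""
    left, right, cost = [], [], 0
    for op in ops:
        if op == "enqueue":
            left.append(None)                    # ins_l
            cost += 1
        elif right:                              # dequeue, cheap case
            right.pop()                          # del_r
            cost += 1
        elif left:                               # dequeue, expensive case
            while left:
                left.pop()                       # del_l
                right.append(None)               # ins_r
                cost += 2
            right.pop()                          # del_r
            cost += 1
    return cost

# any sequence of n operations costs at most 4n list operations
random.seed(0)
ops = [random.choice(["enqueue", "dequeue"]) for _ in range(1000)]
assert simulate(ops) <= 4 * len(ops)
```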
Remarks: The aggregate method generally tries to find an (asymptotically) tight bound on the cost of a sequence of operations, by aggregating costs. In our example, aggregation meant estimating the number of operations on lists $ l$ and $ r$ , instead of explicitly counting element insertions, moves and removals.
As before, suppose $ S$ contains only $ enqueue$ and $ dequeue$ operations. In the banking method, we imagine that our data structure (the FIFO) is a bank. We (over)estimate the cost of each operation type, in such a way that the surplus charged on cheap operations covers the cost of the expensive ones.
We call this estimated cost the amortised cost (usually denoted $ \hat{c}$ ).
The golden rule of the banking method is that: no expensive operation can take more credit than the bank has available.
For the FIFO, we estimate the amortised cost of $ enqueue$ to be $ 3$ , and that of $ dequeue$ to be $ 1$ . To validate this estimation, we verify the golden rule.
Let $ e_i = \hat{c_i} - c_i$ , where $ \hat{c_i}$ is the amortised cost of the $ i$ th operation and $ c_i$ is its real cost.
The golden rule is formally expressed as follows:
$ \displaystyle \forall S: \sum_{ith\;op\;in\;S} e_i \geq 0$
The quantification $ \forall S$ means that the inequality must hold after any sequence of operations, while the sum captures the total credit left in the bank at the end of executing $ S$ .
The rule is generally presented in the form:
$ \displaystyle \forall S: \sum_{ith\;op\;in\;S} \hat{c}_i \geq \sum_{ith\;op\;in\;S} c_i$
which states that the sum of the amortised costs over any sequence of operations must be an upper bound on the sum of the real costs. We verify this inequality for our estimated FIFO amortised costs. We need to check that:
$ 3*\#enq + \#deq \geq cost(ins_l) + cost(copy) + cost(del_r)$
where $ \#enq$ (resp. $ \#deq$ ) is the number of $ enqueue$ (resp. $ dequeue$ ) operations. As in the aggregate method, we have reformulated the real cost in terms of list operations. We observe that each enqueued element is inserted into $ l$ exactly once, hence $ cost(ins_l) = \#enq$ ; each element is copied (one deletion from $ l$ plus one insertion into $ r$ ) at most once, hence $ cost(copy) \leq 2 * \#enq$ ; finally, $ cost(del_r) \leq \#deq$ . Summing up, the left-hand side dominates the right-hand side, which concludes our analysis.
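The banking scheme can also be checked mechanically. The sketch below (our own counters for the two lists) charges 3 per enqueue and 1 per dequeue and verifies that the accumulated credit never goes negative:

```python
def bank_never_negative(ops):
    """Charge an amortised cost of 3 per enqueue and 1 per dequeue;
    check that the credit (amortised minus real cost, accumulated)
    never becomes negative. Dequeues are assumed on a non-empty FIFO."""
    left = right = credit = 0
    for op in ops:
        if op == "enqueue":
            left += 1
            credit += 3 - 1                  # charge 3, real cost 1
        else:
            if right == 0:
                real = 2 * left + 1          # move every element, then remove one
                right, left = left, 0
            else:
                real = 1
            right -= 1
            credit += 1 - real               # charge 1
        if credit < 0:
            return False
    return True

assert bank_never_negative(["enqueue", "enqueue", "dequeue", "dequeue"])
```

Intuitively, the credit is always exactly twice the size of the left list, which is why the expensive dequeue can always be paid for.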
Remarks:
The potential method is conceptually similar to the banking method; however, instead of estimating amortised costs directly, we define a potential function which models how the credit in the bank changes. More precisely:
The golden rule of the potential method is that the difference between the potential of any current state of the data structure and that of the initial state can never be negative.
Let $ S = op_1, \ldots, op_n$ and denote by $ F_i$ the state (contents) of the FIFO after the $ i$ th operation. Also, denote by $ \Phi(F_i)$ the potential of the FIFO after the $ i$ th operation; $ \Phi(F_0)$ is the potential of the FIFO in the initial state. The golden rule is expressed as:
$ \forall S: \Phi(F_n) - \Phi(F_0) \geq 0$
We estimate the potential function to be:
$ \Phi(F_n) = 2 * size(l)$
where $ size(l)$ is the size of the left list. The golden rule is easily verified in this particular case, since $ \Phi(F_0) = 0$ and $ \Phi$ is always non-negative.
Having found the potential function, we can determine the amortised cost via the following general formula:
$ \hat{c_i} = c_i + \Phi(F_i) - \Phi(F_{i-1})$
which states that the amortised cost of an operation is its real cost plus the difference in potential between the states after the $ (i-1)$ th and $ i$ th operations (this difference may be positive or negative).
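Summed over a whole sequence, the potential differences telescope, so the golden rule immediately yields the banking inequality:

$ \displaystyle \forall S: \sum_{ith\;op\;in\;S} \hat{c}_i = \sum_{ith\;op\;in\;S} c_i + \Phi(F_n) - \Phi(F_0) \geq \sum_{ith\;op\;in\;S} c_i$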
Hence:
$ \hat{c}_{enq} = 1 + 2*size(l_i) - 2*size(l_{i-1}) = 1 + 2 = 3$
where $ size(l_i)$ is the size of the left list after the $ ith$ operation.
For dequeue, we consider two cases:

* if the right list is non-empty, the real cost is $ 1$ and the left list is untouched, so the potential does not change; hence $ \hat{c}_{deq} = 1 + 0 = 1$ ;
* if the right list is empty, all $ m = size(l_{i-1})$ elements are first moved from the left to the right list (each move being one deletion plus one insertion), followed by one removal, for a real cost of $ 2m + 1$ ; the potential drops from $ 2m$ to $ 0$ , hence $ \hat{c}_{deq} = 2m + 1 - 2m = 1$ .

In both cases, the amortised cost of dequeue is $ 1$ .
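These potential computations can be checked mechanically (a Python sketch with our own counters for the two lists; dequeues are assumed on a non-empty FIFO):

```python
def amortised_costs(ops):
    """Compute c_i + Phi(F_i) - Phi(F_{i-1}) for each operation,
    with Phi(F) = 2 * size(l)."""
    left = right = 0
    costs = []
    for op in ops:
        phi_before = 2 * left
        if op == "enqueue":
            left += 1
            real = 1
        else:
            if right == 0:
                real = 2 * left + 1          # move every element, then remove one
                right, left = left, 0
            else:
                real = 1
            right -= 1
        costs.append(real + 2 * left - phi_before)
    return costs

# every enqueue has amortised cost 3, every dequeue has amortised cost 1
assert amortised_costs(["enqueue"] * 3 + ["dequeue"] * 3) == [3, 3, 3, 1, 1, 1]
```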
Remarks:
Consider the array list implementation illustrated in the previous lectures. The cost of an insert ($ cons$ ) operation is $ \Theta(n)$ when the array holding the list is at full capacity, where $ n$ is the number of elements in the array.
We analyse the cost of a sequence of $ ins$ operations performed on an array list. Let $ size(L)$ denote the capacity of the holding array, and $ elems(L)$ the number of elements inserted in the array list.
We recall that, whenever the array becomes full, a new array of twice the capacity is allocated, and all elements are copied into it.
Let $ S$ be a sequence of $ ins$ operations. We aggregate the actual insertion vs copy costs. This is illustrated in the table below, for a sequence of 9 operations:
Operation no | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Total cost | 1 | 2 | 3 | 1 | 5 | 1 | 1 | 1 | 9 |
Copy cost | 0 | 1 | 2 | 0 | 4 | 0 | 0 | 0 | 8 |
Ins cost | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
In the general case: $ cost(S) = ins\_cost(S) + copy\_cost(S)$
To compute $ copy\_cost(S)$ , we observe that, if $ k$ is the number of copy operations after a sequence of $ n$ operations, then:
$ 2^{k-1} < n \leq 2^{k}$ hence: $ k-1 < log (n) \leq k$ and thus $ k = \lceil\log{n}\rceil$ is the number of copy operations.
$ \displaystyle cost(S) = n + \sum_{i=0}^{\lceil\log{n}\rceil - 1} 2^i \leq n + \sum_{i=0}^{\log{n}} 2^i = 3n - 1$
Thus, the average cost per operation is constant: $ \frac{cost(S)}{n} = \frac{\Theta(n)}{n} = \Theta(1)$
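This aggregate computation can be reproduced directly (a Python sketch of ours, assuming an initial capacity of 1, consistent with the table above):

```python
def total_cost(n):
    """Total cost of n insertions into an array list that starts with
    capacity 1 and doubles whenever full: 1 per insertion, plus the
    number of elements copied at each resize."""
    capacity, elems, cost = 1, 0, 0
    for _ in range(n):
        if elems == capacity:
            cost += elems                    # copy every element across
            capacity *= 2
        elems += 1
        cost += 1                            # the insertion itself
    return cost

assert total_cost(9) == 24                   # matches the table: 1+2+3+1+5+1+1+1+9
for n in (1, 10, 1000):
    assert total_cost(n) <= 3 * n            # the aggregate bound
```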
We estimate the amortised cost of each $ ins$ operation to be $ 3$ : one unit pays for the insertion itself, and two units are deposited as credit. We illustrate this choice via an example. Suppose we have a half-full array (capacity 4, 2 elements), and the current credit is zero:
credit = 0:
* | * | | |
---|---|---|---|
after an insertion, we paid 1 for it and deposited 2, hence credit = 2:
* | * | * | |
---|---|---|---|
after another insertion, credit = 4:
* | * | * | * |
---|---|---|---|
Now the array is full, and we have enough credit to pay for copying all elements. After another insertion, the 4 units of credit pay for copying the 4 elements into a new array of capacity 8; the insertion itself pays 1 and deposits 2, so credit = 2 and the array is again roughly half-full:

* | * | * | * | * | | | |
---|---|---|---|---|---|---|---|
We verify the golden rule of the banking method:
$ \displaystyle \forall S: \sum_{ith\;op\;in\;S} \hat{c}_i \geq \sum_{ith\;op\;in\;S} c_i$
which yields:
$ \displaystyle \forall S: 3n \geq \sum_{ith\;op\;in\;S} c_i$
which was already established via the aggregate method.
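As with the FIFO, the banking scheme can be checked mechanically. The sketch below (our own, again assuming an initial capacity of 1) charges 3 per insertion and verifies that the bank never goes negative:

```python
def credit_stays_nonnegative(n):
    """Charge an amortised cost of 3 per insertion; check that the
    bank (accumulated credit) never becomes negative over n insertions."""
    capacity, elems, credit = 1, 0, 0
    for _ in range(n):
        real = 1
        if elems == capacity:
            real += elems                    # pay for copying every element
            capacity *= 2
        elems += 1
        credit += 3 - real
        if credit < 0:
            return False
    return True

assert credit_stays_nonnegative(10000)
```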
We fix $ \Phi(L_0) = 0$ and $ \Phi(L_i) = 2 * elems(L_i) - size(L_i)$ .
We observe that $ elems(L_i) \geq size(L_i)/2$ , since with this growth policy we can never have fewer elements than half the capacity of the array. Thus, the golden rule of the potential method:
$ \forall S: \Phi(L_n) - \Phi(L_0) \geq 0$
is immediately verified.
To compute the amortised cost, we observe two cases:

* if the array is not full, the real cost is $ 1$ and the potential increases by $ 2$ (one extra element, same capacity), hence $ \hat{c_i} = 1 + 2 = 3$ ;
* if the array is full, let $ m = elems(L_{i-1}) = size(L_{i-1})$ ; the real cost is $ m + 1$ , while the potential goes from $ 2m - m = m$ to $ 2(m+1) - 2m = 2$ , hence $ \hat{c_i} = (m + 1) + 2 - m = 3$ .

Incidentally, we have identified precisely the same amortised cost as with the banking method.
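The two cases can be checked mechanically (our own sketch; note that fixing $ \Phi(L_0) = 0$ rather than $ 2 \cdot 0 - 1$ makes the very first insertion slightly cheaper):

```python
def potential_amortised(n):
    """Amortised costs c_i + Phi(L_i) - Phi(L_{i-1}) over n insertions,
    with Phi(L) = 2 * elems(L) - size(L) and Phi(L_0) fixed to 0."""
    capacity, elems, phi_prev = 1, 0, 0
    costs = []
    for _ in range(n):
        real = 1
        if elems == capacity:
            real += elems                    # the resize copies every element
            capacity *= 2
        elems += 1
        phi = 2 * elems - capacity
        costs.append(real + phi - phi_prev)
        phi_prev = phi
    return costs

# every insertion except the very first has amortised cost exactly 3
assert all(c == 3 for c in potential_amortised(100)[1:])
```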