====== Lazy Evaluation in Haskell ======
===== Introduction =====
**Lazy evaluation** means that:
- an expression (function application) is evaluated **only when it is needed** (precisely as in normal-order evaluation in the Lambda Calculus)
- an expression is evaluated **only once**
We illustrate point 1. via the following example:
nats = 0:(map (+1) nats)
test = foldr (\x y-> if x > 2 then 0 else x+y) 10 nats
* First, note that ''nats'' is a recursive non-terminating expression, which produces the list of natural numbers until memory is depleted. To observe this, it is sufficient to evaluate ''nats'' in the interpreter.
* Second, ''test'' is an expression which evaluates to ''3'', **although it relies on ''nats''** for the computation:
* let ''op = \x y-> if x > 2 then 0 else x+y''. Then, in effect, ''test'' attempts to compute the expression:
0 `op` (1 `op` (2 `op` (3 `op` (4 `op` ....
* however, in the expression ''(3 `op` (4 `op` .... '' the value of the second operand is not used (since ''x > 2'' holds for ''x = 3''), hence ''(4 `op` .... '' is never evaluated and this subexpression evaluates to ''0''. Thus, ''test'' actually computes
0 `op` (1 `op` (2 `op` 0))
* note that the initial value of the accumulator (''10'') is never used, because the fold never reaches the end of the list.
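The example above can be loaded into GHCi as the following minimal sketch. The name ''op'' is the one introduced in the bullets, and ''firstFive'' is a hypothetical helper added only to inspect a finite prefix of ''nats'' safely.

```haskell
-- The infinite list of natural numbers, defined in terms of itself.
nats :: [Integer]
nats = 0 : map (+1) nats

-- The folding operator: it ignores its second argument whenever x > 2.
op :: Integer -> Integer -> Integer
op x y = if x > 2 then 0 else x + y

-- foldr builds 0 `op` (1 `op` (2 `op` (3 `op` ...))) lazily,
-- so the fold stops as soon as `op` ignores its second argument.
test :: Integer
test = foldr op 10 nats   -- evaluates to 3

-- Hypothetical helper: only the first five cells of nats are ever built.
firstFive :: [Integer]
firstFive = take 5 nats   -- [0,1,2,3,4]
```

Replacing ''foldr'' with ''foldl'' would not terminate here, since ''foldl'' must traverse the whole list before it can return anything.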
To illustrate point 2. consider:
evens = zipWith (+) nats nats
some = take 2 evens
We also recall that:
take 0 _ = []
take n (h:t) = h:(take (n-1) t)
zipWith op (x:xs) (y:ys) = (op x y):(zipWith op xs ys)
zipWith _ _ _ = []
No expression is evaluated until we evaluate ''some'' in the interpreter. Thus, we start with the following unevaluated expressions:
^ Variable ^ Expression ^ Value ^
| evens | ? | unevaluated |
| some | ? | unevaluated |
| nats | ? | unevaluated |
Evaluating ''some'' exposes its body, ''take 2 evens''. The pattern-matching definition of ''take'' requires an argument of the form ''(h:t)'', which forces us to evaluate ''evens''.
| evens | ? | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ? | unevaluated |
To evaluate ''evens'', as before, the pattern-matching definition of ''zipWith'' requires the first element of ''nats'':
| evens | ''zipWith (+) nats nats'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ? | unevaluated |
Note that we only require ''nats'' to have the form ''(x:xs)'' in order to evaluate ''zipWith'' in one step, hence ''(map (+1) nats)'' is not (yet) evaluated.
| evens | ''zipWith (+) nats nats'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |
Thus, the evaluation of ''zipWith'' yields:
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |
Although we add separate rows to the table below, we stress that **the expressions ''(map (+1) nats)''** which appear in the body of ''nats'', ''xs'' and ''ys'' **are actually the same expression**, and not distinct copies which happen to look identical. The first-step evaluation of ''take 2 evens'' is now complete:
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |
| xs,ys | ''(map (+1) nats)'' | unevaluated |
| t | ''zipWith (+) xs ys'' | unevaluated |
As before, note that both occurrences of ''zipWith (+) xs ys'' are **actually the same expression**. We continue with another step in the evaluation of ''take'', which leads to evaluating ''zipWith (+) xs ys'', and subsequently, ''xs'' and ''ys'':
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''zipWith (+) xs ys'' | unevaluated |
Note that ''(map (+1) nats)'' is evaluated one step **only once**: ''xs'', ''ys'' and the tail of ''nats'' are the same expression, so all three are updated together and **the expression is not re-evaluated**, as shown in the table above. Since ''nats'' is ''0:xs'', the tail which remains unevaluated is ''(map (+1) xs)''. Next, ''zipWith (+) xs ys'' is evaluated in one step:
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |
This completes the second step in the evaluation of ''take''. For brevity, we do not add rows for the variables ''xs', ys''' and ''t'''.
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:2:(take 0 t')'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |
Now, ''take 0 t''' evaluates to ''[]'', hence we finally get:
| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:2:[]'' | **evaluated** |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |
and the evaluation stops.
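The walkthrough above can be observed experimentally with the following sketch. It assumes GHC and uses ''Debug.Trace.trace'', which prints a message on stderr when the traced expression is actually evaluated. The helper ''step'' is a hypothetical name introduced here; it replaces ''(+1)'' so that each increment becomes visible. ''take'' and ''zipWith'' are redefined locally, as recalled earlier, with one extra clause for the empty list.

```haskell
import Prelude hiding (take, zipWith)
import Debug.Trace (trace)

take :: Int -> [a] -> [a]
take 0 _     = []
take _ []    = []                      -- extra clause, for totality
take n (h:t) = h : take (n - 1) t

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith op (x:xs) (y:ys) = op x y : zipWith op xs ys
zipWith _  _      _      = []

-- Same list as before, but every increment reports itself when it runs.
nats :: [Integer]
nats = 0 : map step nats
  where
    step n = trace ("incrementing " ++ show n) (n + 1)

evens :: [Integer]
evens = zipWith (+) nats nats

some :: [Integer]
some = take 2 evens   -- [0,2]
```

Printing ''some'' in GHCi should show the message ''incrementing 0'' a single time, although the second element of ''nats'' is used as both operands of ''(+)''. No message for ''1'' is expected, because ''take 0 t''' never inspects the rest of the list.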
===== Applications of normal evaluation =====
==== Dynamic programming - edit distance ====
Consider two strings ''s1'' and ''s2''. We define the //edit distance// between ''s1'' and ''s2'' as the **minimal number of edit operations** which make the strings **identical**. The allowed //edit operations// are:
* character insertion (e.g. ''text'' and ''ext'' have edit distance 1)
* character deletion (e.g. ''tet'' and ''text'' have edit distance 1)
* character modification (e.g. ''text'' and ''tent'' have edit distance 1)
As an example, the strings ''maple'' and ''apple'' have edit distance 2:
* we delete the first symbol from ''maple'' and obtain ''aple''
* we insert the symbol ''p'' at the second position in ''aple'' and obtain ''apple''
**Dynamic programming** computes the edit distance between two strings by building a matrix ''d'' having ''size(s1)+1'' rows and ''size(s2)+1'' columns, where ''d[i][j]'' represents the edit distance between the prefixes ''s1[0:i]'' and ''s2[0:j]'':
* ''d[i][0] = i'' for all rows (the distance between the empty string and the prefix ''s1[0:i]'' is ''i'')
* ''d[0][j] = j'' for all columns (the distance between the empty string and the prefix ''s2[0:j]'' is ''j'')
* if the last characters of the two prefixes, ''s1[i-1]'' and ''s2[j-1]'', coincide, then ''d[i][j] = d[i-1][j-1]''
* otherwise, ''d[i][j]'' is computed by applying **the edit operation which minimises distance**. Concretely, ''d[i][j]'' is the minimum of ''d[i-1][j] + 1'' (delete the ''i''-th character of ''s1''), ''d[i-1][j-1] + 1'' (modify the ''i''-th character of ''s1'') and ''d[i][j-1] + 1'' (insert the ''j''-th character of ''s2'' into ''s1'')
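As an illustration, the recurrence above can be transcribed directly into a recursive Haskell function. This is only a sketch: the name ''editDistanceNaive'' is hypothetical, no matrix is built, and the same subproblems are recomputed many times, so the running time is exponential. It is meant to make the four cases concrete, not to be efficient.

```haskell
-- Direct transcription of the recurrence; go i j plays the role of d[i][j].
editDistanceNaive :: String -> String -> Int
editDistanceNaive s1 s2 = go (length s1) (length s2)
  where
    go i 0 = i                                   -- d[i][0] = i
    go 0 j = j                                   -- d[0][j] = j
    go i j
      | s1 !! (i - 1) == s2 !! (j - 1) = go (i - 1) (j - 1)
      | otherwise =
          1 + minimum [ go (i - 1) j             -- delete
                      , go (i - 1) (j - 1)       -- modify
                      , go i (j - 1) ]           -- insert
```

For instance, ''editDistanceNaive "maple" "apple"'' yields ''2'', matching the example given earlier.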
==== Functional implementation ====
Dynamic programming (for edit distance) can be efficiently implemented in Haskell, by exploiting laziness **in order to compute only the necessary distances**, and each of them **only once**. We start with an example of an implementation of the infinite list of Fibonacci numbers:
fibo = 0:1:(zipWith (+) fibo (tail fibo))
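The same idea, a data structure whose elements are defined in terms of other elements of the same structure, can be applied to the matrix ''d''. The sketch below is one possible way to do this, not necessarily the implementation intended by these notes. It assumes the ''Data.Array'' module (from the ''array'' package shipped with GHC), whose arrays are lazy in their elements, and the names ''editDistance'', ''a1'', ''a2'' and ''dist'' are hypothetical.

```haskell
import Data.Array (Array, array, listArray, range, (!))

-- Each element is defined using earlier elements of the same list.
fibo :: [Integer]
fibo = 0 : 1 : zipWith (+) fibo (tail fibo)

editDistance :: String -> String -> Int
editDistance s1 s2 = d ! (m, n)
  where
    m = length s1
    n = length s2
    -- 1-indexed copies of the strings, for constant-time access
    a1 = listArray (1, m) s1 :: Array Int Char
    a2 = listArray (1, n) s2 :: Array Int Char
    bnds = ((0, 0), (m, n))
    -- Every cell starts as an unevaluated expression which may refer to
    -- other cells of d; a cell is computed on demand and at most once.
    d :: Array (Int, Int) Int
    d = array bnds [ (ij, dist ij) | ij <- range bnds ]
    dist (i, 0) = i
    dist (0, j) = j
    dist (i, j)
      | a1 ! i == a2 ! j = d ! (i - 1, j - 1)
      | otherwise =
          1 + minimum [ d ! (i - 1, j)        -- delete
                      , d ! (i - 1, j - 1)    -- modify
                      , d ! (i, j - 1) ]      -- insert
```

Here ''take 10 fibo'' gives ''[0,1,1,2,3,5,8,13,21,34]'' and ''editDistance "maple" "apple"'' gives ''2''. Only the cells reachable from ''d ! (m, n)'' are ever evaluated, which is how laziness provides both properties mentioned above.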
====== References ======
* [[ http://jelv.is/blog/Lazy-Dynamic-Programming/ | Lazy dynamic programming ]]