====== Lazy Evaluation in Haskell ======

===== Introduction =====

**Lazy evaluation** means that:
  - an expression (function application) is evaluated **only when it is needed** (precisely as in the normal-order evaluation of the Lambda Calculus)
  - an expression is evaluated **only once**

We illustrate point 1 via the following example:

  nats = 0:(map (+1) nats)
  test = foldr (\x y-> if x > 2 then 0 else x+y) 10 nats

  * First, note that ''nats'' is a recursive, non-terminating expression which produces the list of natural numbers until memory is depleted. To observe this, it is sufficient to evaluate ''nats'' in the interpreter.
  * Second, ''test'' is an expression which evaluates to ''3'', **although it relies on ''nats''** for the computation:
    * let ''op = \x y-> if x > 2 then 0 else x+y''. Then, in effect, ''test'' attempts to compute the expression ''0 `op` (1 `op` (2 `op` (3 `op` (4 `op` ....''
    * however, in the expression ''(3 `op` (4 `op` ....'' the value of the second operand is not used (since ''x = 3'' satisfies ''x > 2''), hence ''(4 `op` ....'' is not evaluated. The result is ''0''. Thus, ''test'' actually computes ''0 `op` (1 `op` (2 `op` 0))'', that is, ''0 + (1 + (2 + 0)) = 3''
    * note that the value of the accumulator (''10'') is not actually used.

To illustrate point 2, consider:

  evens = zipWith (+) nats nats
  some = take 2 evens

We also recall that:

  take 0 _ = []
  take n (h:t) = h:(take (n-1) t)
  
  zipWith op (x:xs) (y:ys) = (op x y):(zipWith op xs ys)
  zipWith _ _ _ = []

No expression is evaluated until we call ''some'' in the interpreter. Thus, we start with the following unevaluated expressions:

^ Variable ^ Expression ^ Value ^
| evens | ? | unevaluated |
| some | ? | unevaluated |
| nats | ? | unevaluated |

Calling ''some'' starts the evaluation of ''take 2 evens'', which in turn requires us to evaluate ''evens''. This happens due to the pattern-matching definition of ''take'', which requires a value of the form ''(h:t)''.

| evens | ? | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ? | unevaluated |

To evaluate ''evens'', as before, the pattern-matching definition of ''zipWith'' requires the first element of ''nats'':

| evens | ''zipWith (+) nats nats'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ? | unevaluated |

Note that we only require ''x'' and ''y'' to evaluate ''zipWith'' in one step, hence ''(map (+1) nats)'' is not (yet) evaluated.

| evens | ''zipWith (+) nats nats'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |

Thus, the evaluation of ''zipWith'' yields:

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''take 2 evens'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |

Although the next table gains additional rows, we stress that **the expressions ''(map (+1) nats)''** which appear in the body of ''nats'' and which are bound to ''xs'' and ''ys'' **are actually the same** expression, and not distinct identical copies. The first-step evaluation of ''take 2 evens'' is now complete:

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:(map (+1) nats)'' | unevaluated |
| xs,ys | ''(map (+1) nats)'' | unevaluated |
| t | ''zipWith (+) xs ys'' | unevaluated |

As before, note that both occurrences of ''zipWith (+) xs ys'' are **actually the same expression**. We continue with another step in the evaluation of ''take'', which leads to evaluating ''zipWith (+) xs ys'' and, subsequently, ''xs'' and ''ys''. Since the tail of ''nats'' is ''xs'' itself, one evaluation step of ''(map (+1) nats)'' leaves ''map (+1) xs'' as the remaining expression:

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''zipWith (+) xs ys'' | unevaluated |

Note that the expression ''(map (+1) nats)'' is evaluated one step **only once**, although ''nats'', ''xs'' and ''ys'' all refer to it: **the expression is not re-evaluated**, as shown by the single ''xs,ys'' row in the table above.
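The sharing described above can be observed directly. The following is a minimal sketch, assuming GHC and the standard ''Debug.Trace'' module. The ''trace'' call and its message are additions made only for illustration, and the Prelude versions of ''take'' and ''zipWith'' are assumed to behave like the definitions recalled earlier.

```haskell
import Debug.Trace (trace)

-- Same shape as the nats from the text, except that producing each
-- successor also emits a trace message (an addition for illustration).
nats :: [Integer]
nats = 0 : map (\n -> trace ("computing successor of " ++ show n) (n + 1)) nats

-- nats is used twice here, but both uses refer to the same list.
evens :: [Integer]
evens = zipWith (+) nats nats

some :: [Integer]
some = take 2 evens
```

When ''some'' is evaluated in the interpreter, the message for ''0'' is expected to appear once rather than twice, since both arguments of ''zipWith'' share the same element. No message should appear for later elements, because ''take 2'' never demands them.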
Evaluating ''zipWith (+) xs ys'' one step then yields:

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:(take 1 t)'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |

We have now finished the second step in the evaluation of ''take''. We omit adding the variables ''xs', ys''' and ''t'''.

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:2:(take 0 t')'' | unevaluated |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |

Now, ''take 0 t''' evaluates to ''[]'', hence we finally get:

| evens | ''(0+0):zipWith (+) xs ys'' | unevaluated |
| some | ''0:2:[]'' | **evaluated** |
| nats | ''0:1:(map (+1) xs)'' | unevaluated |
| xs,ys | ''1:(map (+1) xs)'' | unevaluated |
| t | ''(1+1):zipWith (+) xs' ys''' | unevaluated |

and the evaluation stops.

===== Applications of normal evaluation =====

==== Dynamic programming - edit distance ====

Consider two strings ''s1'' and ''s2''. We define the //edit distance// between ''s1'' and ''s2'' as the **minimal number of edit operations** which make the strings **identical**. The allowed //edit operations// are:
  * character insertion (e.g. ''text'' and ''ext'' have edit distance 1)
  * character deletion (e.g. ''tet'' and ''text'' have edit distance 1)
  * character modification (e.g. ''text'' and ''tent'' have edit distance 1)

As an example, the strings ''maple'' and ''apple'' have edit distance 2:
  * we delete the first symbol from ''maple'' and obtain ''aple''
  * we insert the symbol ''p'' at the second position in ''aple'' and obtain ''apple''

**Dynamic programming** computes the edit distance between two strings by building a matrix ''d'' having ''size(s1)+1'' lines and ''size(s2)+1'' columns, where ''d[i][j]'' represents the edit distance between the prefixes ''s1[0:i]'' and ''s2[0:j]'':
  * ''d[i][0] = i'' for all lines (the distance between the empty string and the prefix ''s1[0:i]'' is ''i'')
  * ''d[0][j] = j'' for all columns (the distance between the empty string and the prefix ''s2[0:j]'' is ''j'')
  * if the ''i''-th character of ''s1'' and the ''j''-th character of ''s2'' coincide, then ''d[i][j] = d[i-1][j-1]''
  * otherwise, ''d[i][j]'' is computed by applying **the edit operation which minimises distance**. Concretely, ''d[i][j]'' is the minimum of ''d[i-1][j] + 1'' (delete character ''i'' from ''s1''), ''d[i-1][j-1] + 1'' (modify character ''i'' of ''s1'') and ''d[i][j-1] + 1'' (insert character ''j'' of ''s2'' into ''s1'')

==== Functional implementation ====

Dynamic programming (for edit distance) can be efficiently implemented in Haskell by exploiting laziness **in order to compute only the necessary distances**, and each of them **only once** (see [[http://jelv.is/blog/Lazy-Dynamic-Programming/|Lazy dynamic programming]]). We start with an example of the same idea, the infinite list of Fibonacci numbers, which is defined in terms of itself:

  fibo = 0:1:(zipWith (+) fibo (tail fibo))
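The recurrence for ''d'' can be sketched as a lazily-filled matrix, in the same spirit as ''fibo'' referring to itself. This is a minimal sketch and not a definitive implementation: it assumes the ''Data.Array'' module from the ''array'' package shipped with GHC, and the names ''editDistance'', ''a1'', ''a2'', ''bnds'' and ''dist'' are hypothetical, chosen only for illustration.

```haskell
import Data.Array (Array, listArray, range, (!))

-- Hypothetical helper (not from the original text): edit distance computed
-- through a matrix d whose cells are unevaluated until demanded.
editDistance :: String -> String -> Int
editDistance s1 s2 = d ! (m, n)
  where
    m = length s1
    n = length s2

    -- 1-based character arrays: a1 ! i is the i-th character of s1.
    a1 = listArray (1, m) s1 :: Array Int Char
    a2 = listArray (1, n) s2 :: Array Int Char

    bnds = ((0, 0), (m, n))

    -- Boxed arrays are lazy in their elements: every cell starts as a thunk
    -- that may refer to other cells of d, and is evaluated at most once.
    d :: Array (Int, Int) Int
    d = listArray bnds [dist i j | (i, j) <- range bnds]

    dist :: Int -> Int -> Int
    dist i 0 = i                        -- first column
    dist 0 j = j                        -- first line
    dist i j
      | a1 ! i == a2 ! j = d ! (i - 1, j - 1)
      | otherwise =
          1 + minimum [ d ! (i - 1, j)      -- delete
                      , d ! (i - 1, j - 1)  -- modify
                      , d ! (i, j - 1)      -- insert
                      ]
```

With these definitions, ''editDistance "maple" "apple"'' evaluates to ''2'', in agreement with the example given for the two strings. Cells on which the final result does not depend are never computed, and a cell that several other cells refer to is computed only once.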