Lazy Evaluation of Haskell

Albert Y. C. Lai, trebla [at] vex [dot] net

Preface: A Note on Semantics

This article is about lazy evaluation, an evaluation strategy, i.e., an operational semantics: it specifies the steps a computer takes to compute answers. Lazy evaluation is a popular way to implement most of Haskell, but not the only way.

Haskell itself is not defined to be lazy; for most things, it only specifies non-strictness, not what steps a computer takes. Non-strictness is a term from denotational semantics, which only relates expressions to values. Non-strictness means, for example,

const True ⊥ = True

(⊥ is a theoretical value to stand for “no information” or “no answer”. I cannot say “non-termination” because that would require talking about steps.) In other words, non-strictness means that an expression can have an answer even if some of its arguments have no answer.
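As a quick check of this (my own example; GHC's undefined plays the role of ⊥ here), a GHCi session behaves like this:

ghci> const True undefined
True
ghci> undefined
*** Exception: Prelude.undefined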

Denotational semantics says whether you get an answer and what answer, but does not explain or control the amount of time and memory used to get it. This is when you need the operational semantics given by compilers and interpreters; fortunately, most of them give lazy evaluation, coarsely speaking. Lazy evaluation suffices to explain asymptotic complexities such as why some code uses Θ(n) memory and why an alternative uses Θ(1) memory. Even such a coarse understanding without real cycle counts and byte counts already solves a lot of mysteries because most programmers do not even get this far.

If actual cycle counts and byte counts matter, if you need to explain why some code uses 40 MB memory and why an alternative uses just 10 MB, then lazy evaluation does not suffice, and you need the low-level operational semantics of your compiler or interpreter, including code optimizations. This is outside the scope of this article. But seeing that the low-level operational semantics is a refinement of lazy evaluation, you should probably know lazy evaluation first.

Lazy evaluation is not the right tool for explaining what answer you get, if that is all you want regardless of complexity. It can be done, and most people do it (out of poor education), but it is too low-level and tedious, missing the forest for the trees, nay, the tree leaves. Denotational semantics is better, but still not high-level enough. Axiomatic semantics combined with program derivation is the most suitable, and I have triumphant examples of the axiomatic, derivational style: scanl examples, fix examples, a primitive recursion example.

Whirlwind Tour of Lazy Evaluation

The main points of lazy evaluation are pretty straightforward, just unfamiliar: short-circuit evaluation, and sharing the result (if the thing evaluated has multiple parents). This is of course vague; I illustrate below and fully clarify in subsequent sections.

To illustrate short-circuit evaluation, I evaluate (\c _ -> c) True (3*4+3*4) :

(\c _ -> c) True (3*4+3*4)
{function application}
True

In particular the expression 3*4+3*4 is left alone because the rule for function application doesn't evaluate it.

A longer example, to show what pattern-matching evaluates:

case (\c _ -> c) (Just (3*4+3*4)) 4 of { Just _ -> True; Nothing -> False }
{evaluate until match}
(\c _ -> c) (Just (3*4+3*4)) 4
{function application}
Just (3*4+3*4)
case Just (3*4+3*4) of { Just _ -> True; Nothing -> False }
{match pattern}
True

The sub-expression is evaluated only as far as needed to tell which pattern matches; 3*4+3*4 is again left unevaluated.

But sometimes we are also interested in evaluating 3*4+3*4, for example if we expressly want it done, or if the pattern above were Just 0. Here is how.

The operators (+) and (*) for built-in numeric types (Int, Integer, Float, Double) are eager, not lazy: to evaluate e+f (say), both e and f are evaluated first. Let us also pick the evaluation order so that we evaluate e, then f, then get back to finish e+f; similarly for e*f. I now evaluate 3*4+3*4 :

3*4+3*4
{first operand not done}
3*4
{both operands done, arithmetic}
12
12+3*4
{second operand not done}
3*4
{both operands done, arithmetic}
12
12+12
{both operands done, arithmetic}
24

Now I illustrate sharing. I evaluate (\x->x+x) (3*4) :

(\x->x+x) (3*4)
{function application}
[graph: _ + _ with both operand fields pointing to one shared 3*4 node]
{need first operand}
3*4
{both operands done, arithmetic}
12
[graph: _ + _ with both operand fields pointing to the shared node, which is now 12]
{both operands done, arithmetic}
24

Because the formal argument x is mentioned multiple times, lazy evaluation shares the actual argument 3*4, as well as everything that subsequently happens to it, such as it becoming 12. The same sharing applies to bindings in pattern-matching, let, where, and top-level definitions. (For example, let x=3*4 in x+x is almost identical to the above.) However, sharing does not apply to 3*4+3*4 per se, unless the compiler applies an optimization called common-subexpression elimination.
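One way to observe this sharing at run time is Debug.Trace (a sketch of my own, not part of the original exposition; it assumes no common-subexpression elimination is applied):

import Debug.Trace (trace)

shared :: Int
shared = let x = trace "evaluating 3*4" (3*4) in x + x
-- the message prints once, and the result is 24

unshared :: Int
unshared = trace "left 3*4" (3*4) + trace "right 3*4" (3*4)
-- without common-subexpression elimination, both messages print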

This whirlwind tour gives a glimpse of lazy evaluation but leaves the precise rules unsaid. These are spelled out in subsequent sections.

Notation: Inlined Expression Graph

The notation I use to depict expressions is designed to make explicit what is shared (and what is not shared); this is very important because it makes the whole world of difference in lazy evaluation. The notation also happens to be close to how most compilers and interpreters represent expressions.

A typical expression consists of mostly function applications and constructors, like let x=blah in f x : g x. In the extreme, I may draw a graph in which a node stands for each function application or constructor, with explicit pointers to functions and arguments, like this:

[graph: one node per application: a (:) node whose two fields point to an (f _) node and a (g _) node, and both of their argument fields point to a single shared blah node]

This extreme takes a lot of space and a lot of work. More often, I like to merge most nodes into one, drawing separate nodes for shared things only (in this example blah), like this:

[graph: one merged node containing f _ : g _, with both holes pointing to a single separate shared blah node]

Furthermore, I don't draw separate nodes for lambda-bound variables, though they may occur multiple times. So for example, I leave (\x -> f x : g x) as it is. Lambda-bound variables represent holes to be filled in, rather than pointers to more nodes.

I call these Inlined Expression Graphs: they are like expression graphs, but I inline (merge) unshared nodes to save space and work.

This notation is similar to that in the book Introduction to Functional Programming using Haskell, 2nd edition, by Richard Bird. I add boxes around nodes; I also have slightly more separate nodes.

There is a text-only notation in the paper A Natural Semantics for Lazy Evaluation, by John Launchbury. It is like using a huge let-expression to represent an expression graph, except that it does away with the keywords “let” and “in”. It binds names to nodes (except the root), so pointers can be serialized as target node names. It remains to write down the binding of names to node contents, and to distinguish the root. So for example, let x=3*4 in x+x is written as:

{x↦t*f, t↦3, f↦4} : x+x

(It uses the convention that every argument must be a separate node.)

Likewise, the book The Haskell School of Expression, by Paul Hudak, uses a text-only notation based on where-clauses and bindings when sharing is important, e.g., x+x where x=3*4. I used to use this notation when contributing to the Haskell Wiki a long time ago.

I now decide against a text-only notation because I think that most readers like to see pictures when sharing happens.

Notation: Structured Derivation

I write evaluation steps from start to end like this:

stuff0
{why}
stuff1
{why}
stuff2

Sometimes, evaluation requires zooming into a sub-expression and processing it, before zooming out and continuing with the parent expression. I indicate this by an indented, nested evaluation, like this:

stuff0 blah
{why zoom in}
blah
{why}
yak
stuff0 yak
{why}
stuff1

This format is a fragment of Structured Derivation by Ralph-Johan Back. The most important feature I use here is nesting, which is what “structured” refers to. Most other presentations are unnested (including Bird's, Hudak's, all webpages before this one; fortunately, Launchbury's has nesting), i.e.,

stuff0 blah
{why blah becomes yak}
stuff0 yak
{why}
stuff1

A nested presentation is superior in several ways. It separates the two concerns—which rule triggers zooming into blah, which rule transforms blah—and explains them separately. It cuts the bloat and confusion of copying the parent expression over and over again when the sub-expression takes many steps to process. It is closer to what most compilers and interpreters do: when zooming in, one stack unit is reserved; when zooming out, one stack unit is freed. Thus, nesting depth is stack depth, and it completely explains stack overflows.

Function Application (without patterns)

The evaluation rule for function application: evaluate the function until it is a lambda term, then plug in the arguments. Multiple occurrences of a formal argument must all point to the same actual argument node (sharing).

Example: (\c -> (c, 0)) blah

(\c -> (c, 0)) blah
{function application}
(blah, 0)

Example: (\c -> (c, c)) blah

(\c -> (c, c)) blah
{function application}
[graph: a pair node ( _ , _ ) with both fields pointing to one shared blah node]

The above examples are easy because of two simplifications: the function is already a lambda term, and there is only one argument. The following example shows the first complication, where the function needs its own processing first; it uses an evaluation rule covered later.

Example: (if True then (\c -> (c, 0)) else (\c -> (c, 1))) blah

(if True then (\c -> (c, 0)) else (\c -> (c, 1))) blah
{need function}
(if True then (\c -> (c, 0)) else (\c -> (c, 1)))
{if-then-else}
(\c -> (c, 0))
(\c -> (c, 0)) blah
{function application}
(blah, 0)

The second complication is about “multiple” arguments like f x y. Technically, this is (f x) y, i.e., a function application with one argument y, but the function is yet another function application.

Example: (\c _ -> c) True (3*4+3*4)

(\c _ -> c) True (3*4+3*4)
{need function}
(\c _ -> c) True
{function application}
(\_ -> True)
(\_ -> True) (3*4+3*4)
{function application}
True

It is a hassle to add N-1 nesting levels for what can be casually read as 1 function and N arguments. In practice, I will shorten the evaluation steps to plugging in all N arguments at once. This shortcut is safe for asymptotic complexity because the Haskell type system limits the number of arguments below program size. So this shortcut does not mistake a Θ(n) computation for a Θ(1) computation; it only hides a constant factor. This shortcut is also taken by most compilers and interpreters.

Example: (\c _ -> c) True (3*4+3*4)

(\c _ -> c) True (3*4+3*4)
{function application}
True

If the function is a name with a definition elsewhere, I have two choices: I could spend a step turning the name into the corresponding lambda term, and then plug in the arguments; or I could just jump to plugging in the arguments. So here is another shortcut: I will jump to plugging in the arguments. It is less clutter, especially when later I cover functions defined by pattern matching. It is safe for asymptotic complexity. It is also close to most compilers and interpreters.

Example: const True (3*4+3*4)
given: const c _ = c

const True (3*4+3*4)
{function application}
True

A later section covers function applications in which the functions are defined by pattern matching.

Built-in Numbers; Characters

The evaluation rule for comparison and arithmetic of built-in numeric types (Int, Integer, Float, Double): evaluate the operands (in operand order, usually) until they are final answers, then evaluate the expression in question until it is a final answer. For example: to evaluate e+f, evaluate e, then evaluate f, then evaluate e+f.

(For user-defined types, evaluation order and eagerness depend on implementation, since comparison and arithmetic for them are also user-defined via the type classes Eq, Ord, Enum, Num, etc. User-defined code can do anything it likes. There are well-known user-defined naturals with partly lazy comparison and fully lazy succ, for example.)
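To make this concrete, here is a minimal sketch of such naturals (the names Nat, succNat, leNat, and infinity are my own, for illustration only):

data Nat = Z | S Nat

-- succ just wraps one more S around the operand; nothing is evaluated
succNat :: Nat -> Nat
succNat = S

-- comparison evaluates only as deep as the smaller operand
leNat :: Nat -> Nat -> Bool
leNat Z     _     = True
leNat (S _) Z     = False
leNat (S m) (S n) = leNat m n

infinity :: Nat
infinity = S infinity
-- leNat (S (S Z)) infinity answers True after exposing only a few constructors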

Example: 3*4+3*4

3*4+3*4
{first operand not done}
3*4
{both operands done, arithmetic}
12
12+3*4
{second operand not done}
3*4
{both operands done, arithmetic}
12
12+12
{both operands done, arithmetic}
24

Example: 0*(3+4)

0*(3+4)
{second operand not done}
3+4
{arithmetic}
7
0*7
{arithmetic}
0

The evaluation rule for most operations on Char (comparison, most Enum operations, conversion from/to Int) is similar: evaluate the operands (in operand order, usually) until they are final answers, then evaluate the expression in question until it is a final answer.

Example: succ 'g' == pred (pred 'j')

succ 'g' == pred (pred 'j')
{first operand not done}
succ 'g'
{Char succ}
'h'
'h' == pred (pred 'j')
{second operand not done}
pred (pred 'j')
{operand not done}
pred 'j'
{Char pred}
'i'
pred 'i'
{Char pred}
'h'
'h' == 'h'
{Char comparison}
True

Pattern Matching (Level 1) and Conditional

if-then-else can be regarded as a special case of pattern matching. I cover if-then-else first—it's easy.

The evaluation rule for if b then et else ef: evaluate b until it becomes True or False, then go with et or ef, respectively.

Example: if 2>2 then () else ()

if 2>2 then () else ()
{need condition}
2>2
{arithmetic}
False
if False then () else ()
{if-then-else}
()

The evaluation rule for case e of {…}: evaluate the argument e just enough to find the first pattern that matches (and enough to see that it does not match the patterns before it), then bind the variables in the chosen pattern and go with the chosen branch; if no pattern matches, it is a runtime error.

There are a lot of variations in how much evaluation is enough and how much binding and sharing happens, depending on the patterns. In this section and several case studies shortly after, all patterns have at most one level, e.g., there are Just x and x:xs, but none nested like Left (Just x : (Nothing : ys)). There are also no guards. This is why the section title says “Level 1”; we shall accomplish higher levels later! So this section just focuses on level 1 patterns and sharing.

With level 1 patterns, the argument e is evaluated just until its outermost constructor is exposed, and then we know which branch to choose.

This is right for types defined by data, but not quite right for those by newtype. I will cover newtype in the future.

Example: case const (Just a) b of { Just _ -> True; Nothing -> False }

case const (Just a) b of { Just _ -> True; Nothing -> False }
{evaluate argument for pattern Just _}
const (Just a) b
{function application}
Just a
case Just a of { Just _ -> True; Nothing -> False }
{match pattern}
True

A level 1 pattern could also be a Char or numeric literal. Matching against it boils down to comparison using (==), i.e., the argument is evaluated as much as (==) specifies. Thus, for Char and built-in numbers, evaluate until the final answer; for user-defined numbers, it depends on user-defined (==).

Example: case 3*4 of { 0 -> True; _ -> False }

case 3*4 of { 0 -> True; _ -> False }
{evaluate argument for pattern 0}
3*4
{arithmetic}
12
case 12 of { 0 -> True; _ -> False }
{match pattern}
False

There are also level 0 patterns: just a variable or the _ wildcard. Matching them requires no evaluation of the argument.

Example: case 3*4 of { x -> True }

case 3*4 of { x -> True }
{match pattern}
True

Variables in patterns are similar to lambda-bound variables: They are holes to be filled with pointers, and when you fill them in and go with the branch, multiple occurrences point to the same node (sharing).

Example: case Just (3*4) of { Just x -> (x,x); Nothing -> (0,0) }

case Just (3*4) of { Just x -> (x,x); Nothing -> (0,0) }
{match pattern}
[graph: a pair node ( _ , _ ) with both fields pointing to one shared 3*4 node]

Example: case Just (3*4) of { v@(Just x) -> (x,v); Nothing -> undefined }

[graph: case _ of { v@(Just x) -> (x,v); Nothing -> undefined }, where the scrutinee field points to a Just _ node, whose field in turn points to a 3*4 node]
{match pattern}
[graph: a pair node ( _ , _ ): the first field points to the 3*4 node, the second field points to the Just node]

In the graph above, I draw some nodes explicitly to subtly indicate this process. We have a Just node, and it has a pointer to a 3*4 node. We give the Just node to case, and that's what v receives. Whatever our Just node points to, that's what x receives. This explains why 3*4 is shared. The following example further emphasizes that v also shares existing nodes rather than pointing to new nodes.

Example: let j = Just (3*4) in case j of { v@(Just x) -> (x, v, j) }

[graph: case _ of { v@(Just x) -> (x, v, _) }, where the scrutinee field and the hole in the triple both point to one Just _ node (the j binding), whose field points to a 3*4 node]
{match pattern}
[graph: a triple node ( _ , _ , _ ): the first field points to the 3*4 node; the second and third fields both point to the Just node]

Function Application (with patterns)

The evaluation rule for function application, where the function is defined elsewhere by pattern-matching with 1 argument: similar to a case expression, evaluate the argument enough to find the first equation that matches, then bind variables and go with the chosen RHS; if no equation matches, it is a runtime error.

Example: null (iterate id 'c')
given these definitions:

null [] = True
null (_:_) = False
iterate f x = x : iterate f (f x)

null (iterate id 'c')
{evaluate argument for pattern []}
iterate id 'c'
{function application}
'c' : iterate id 'c'
null ('c' : iterate id 'c')
{match pattern, function application}
False

(Technically I should draw a graph to show the sharing of 'c'.)

The evaluation rule for function application, where the function takes several arguments and is defined by pattern-matching, is fairly delicate to describe because the order of evaluating the arguments is delicate.

Presumably, the function definition consists of a sequence of equations. At function application, the equations are tried in the sequence order.

  1. To try the first equation:
    1. Evaluate the 1st argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the second equation.
    2. Evaluate the 2nd argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the second equation.
    3. Etc.
    4. If all arguments match, bind variables.
    5. If the equation comes with no guard, go with its RHS. Otherwise…
    6. If there are guards, evaluate them in sequence until the first guard that evaluates to True, and go with its RHS. If all evaluate to False, jump below to try the second equation.
  2. To try the second equation:
    1. Evaluate the 1st argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the third equation.
    2. Evaluate the 2nd argument until we know it matches or it doesn't match. If it doesn't match, jump below to try the third equation.
  3. To try the third equation: …
  4. If none of the equations is chosen, it's a runtime error.

An important aspect is that the arguments could end up more evaluated than the chosen equation requires, because the equations before it have more specific patterns or more demanding guards, and we need that much evaluation just to decide to skip them.

Example: f [] (r 'y')
given:

r x = x : r x
f [] _ = 0
f _ [] = 1

f [] (r 'y')
{match pattern, function application}
0

Example: f (r 'y') []

f (r 'y') []
{evaluate 1st argument for pattern []}
r 'y'
{function application}
'y' : r 'y'
f ('y' : r 'y') []
{match pattern, function application}
1

Example: g (r 'x') (r 'y')
given:

r x = x : r x
g (x:xs) (y:ys) | x==y = 0
g _ _ = 1

g (r 'x') (r 'y')
{evaluate arguments for 1st g equation}
r 'x'
{function application}
'x' : r 'x'
r 'y'
{function application}
'y' : r 'y'
'x' == 'y'
{Char equality}
False
g ('x' : r 'x') ('y' : r 'y')
{match pattern, function application}
1

Function definitions that have not only patterns and guards but also where-bindings can have very elaborate sharing structures. I will show them in the future when I find a good notation.
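In the meantime, here is a small taste (an example of my own; classify and mid are made-up names): the where-bound mid is shared by the guards and whichever right-hand side is chosen, so it is evaluated at most once per call.

classify :: Int -> Int -> String
classify lo hi
  | mid < 0   = "negative midpoint " ++ show mid
  | mid > 99  = "large midpoint " ++ show mid
  | otherwise = "ordinary midpoint " ++ show mid
  where
    mid = (lo + hi) `div` 2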

Cost Accounting

The most important purpose of knowing lazy evaluation is to explain asymptotic complexity, especially exhaustions such as stack overflow and heap leak.

Asymptotic time complexity is the number of steps in the evaluation sequence.

Asymptotic stack complexity is the depth of nesting in the evaluation sequence (how many times I zoom into sub-expressions).

Asymptotic heap complexity can be gleaned from expression sizes in the evaluation sequence, but it may take some work when there are zoom-in steps.

When there is no zooming in, every step has the complete expression graph, and so the graph size is heap size. Of course, I inline some nodes, so count both node numbers and string lengths to be sure. Richard Bird's book has another good way to count: count actual arguments. All are asymptotically accurate counts, differing in constant overheads only.

When there is zooming in, in some cases none of the steps shows the complete expression graph at that time. So you have to imagine a flattened presentation for the complete graph. Here is an example excerpted from the foldr case study later (there, f abbreviates foldr (+) 0):

f (1:2:3:[])
{match pattern, function application}
1 + f (2:3:[])
{need right operand}
f (2:3:[])
{match pattern, function application}
2 + f (3:[])
{need right operand}
f (3:[])
{match pattern, function application}
3 + f []
{need right operand}
f []
{match pattern, function application}
0

At this point, the complete graph is 1 + (2 + (3 + 0)). The total number of actual arguments is 6 (3 binary operators). In the foldr case study, if you start with a list of length n, then in the middle the complete graph has 2n actual arguments, i.e., Θ(n) heap space.

Case Study: foldr

It is a good exercise to sum numbers (and do other tasks) with foldr, though we know it is fairly expensive.

foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

foldr (+) 0 (1:2:3:[])
{match pattern, function application}
1 + foldr (+) 0 (2:3:[])
{need right operand}
foldr (+) 0 (2:3:[])
{match pattern, function application}
2 + foldr (+) 0 (3:[])
{need right operand}
foldr (+) 0 (3:[])
{match pattern, function application}
3 + foldr (+) 0 []
{need right operand}
foldr (+) 0 []
{match pattern, function application}
0
3 + 0
{arithmetic}
3
2 + 3
{arithmetic}
5
1 + 5
{arithmetic}
6

This takes Θ(n) stack space; a long list will cause a stack overflow. Moreover, at the darkest moment, the whole graph for 1+(2+(3+0)) exists in the heap. This takes Θ(n) heap space.

Lest you think foldr is always a hog, here is a different example.

True || _ = True
False || x = x

foldr (||) False (True:False:False:[])
{match pattern, function application}
True || foldr (||) False (False:False:[])
{match pattern, function application}
True

If the binary operation is lazy in the 2nd argument, foldr is not a hog.
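In fact, with this lazy (||), foldr answers even on an infinite list (an example of my own):

anyTrue :: [Bool] -> Bool
anyTrue = foldr (||) False

-- anyTrue (False : True : repeat False) evaluates to True in constant space,
-- because True || _ never looks at the rest of the fold.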

Case Study: foldl

You have seen how foldr fails at summing numbers. Lest you think foldl is any better:

foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs

foldl (+) 0 (1:2:3:[])
{match pattern, function application}
foldl (+) (0+1) (2:3:[])
{match pattern, function application}
foldl (+) ((0+1)+2) (3:[])
{match pattern, function application}
foldl (+) (((0+1)+2)+3) []
{match pattern, function application}
((0+1)+2)+3
{need left operand}
(0+1)+2
{need left operand}
0+1
{arithmetic}
1
1+2
{arithmetic}
3
3+3
{arithmetic}
6

This uses Θ(n) stack space and Θ(n) heap space. So-called “tail call” is utterly irrelevant. After reading this example, please do me a favour: never ask about “tail call” again.

A better way to sum numbers is covered in the future, involving eagerness annotations.
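As a preview (a sketch of my own, using seq as the eagerness annotation; the standard libraries provide this function as Data.List.foldl'): forcing the accumulator before recursing keeps it a plain number instead of a growing ((0+1)+2)+… expression.

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f z []     = z
foldl' f z (x:xs) = let z' = f z x in z' `seq` foldl' f z' xs

-- foldl' (+) 0 (1:2:3:[]) keeps the accumulator evaluated at every step,
-- so the nesting depth stays constant and no chain of unevaluated (+) builds up.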

Case Study: Average

Even if we have an efficient function for summing—call it goodsum—the next surprise comes when computing averages using goodsum 0 xs / length xs.

Assume that goodsum behaves like this in general:

goodsum 5 (10:xs)
{match pattern, function application}
goodsum (5+10) xs
{magic}
goodsum 15 xs
etc

and these definitions for the rest:

average xs = goodsum 0 xs / length xs
genlist 0 = []
genlist n = n : genlist (n-1)

I won't need a definition for length because I won't get that far.

average (genlist 3)
{function application}
[graph: goodsum 0 _ / length _, with both holes pointing to one shared genlist 3 node]
{evaluate left operand}
goodsum 0 (genlist 3)
{evaluate operand for pattern}
genlist 3
{match pattern, function application}
3 : genlist (3-1)
goodsum 0 (3 : genlist (3-1))
{magic}
goodsum 3 (genlist (3-1))
{evaluate operand for pattern}
genlist (3-1)
{evaluate operand, match pattern, function application}
2 : genlist (2-1)
goodsum 3 (2 : genlist (2-1))
{magic}
goodsum 5 (genlist (2-1))
{evaluate operand for pattern}
genlist (2-1)
{evaluate operand, match pattern, function application}
1 : genlist (1-1)
goodsum 5 (1 : genlist (1-1))
{magic}
goodsum 6 (genlist (1-1))
{evaluate operand for pattern}
genlist (1-1)
{evaluate operand, match pattern, function application}
[]
goodsum 6 []
{match pattern, function application}
6
[graph: 6 / length _, with the hole pointing to the now fully built list 3:2:1:[]]
etc

The effect is that due to sharing, although goodsum tries to forget pieces of the list as it goes, some other part of the parent expression has a pointer to the list and keeps it. Before goodsum begins, the pointer points to just a short genlist 3; when goodsum finishes, the pointer points to a long list, which occupies Θ(n) heap space. Some people consider it a heap leak.

To show this effect more clearly, I give a flat presentation:

average (genlist 3)
{function application}
[graph: goodsum 0 _ / length _, both holes pointing to one shared genlist 3 node]
{evaluate genlist}
[graph: goodsum 0 _ / length _, both holes pointing to a shared cell 3 : _, whose tail points to genlist (3-1)]
{evaluate goodsum}
[graph: goodsum 3 _ / length _; goodsum's hole now points to genlist (3-1); length's hole still points to the cell 3 : _]
{evaluate genlist}
[graph: the genlist (3-1) node has become the cell 2 : _, whose tail points to genlist (2-1); length's hole still reaches 3 : 2 : _]
{evaluate goodsum}
[graph: goodsum 5 _ / length _; goodsum's hole now points to genlist (2-1); length's hole still reaches 3 : 2 : _]
{evaluate genlist}
[graph: the genlist (2-1) node has become the cell 1 : _, whose tail points to genlist (1-1); length's hole still reaches 3 : 2 : 1 : _]
{evaluate goodsum}
[graph: goodsum 6 _ / length _; goodsum's hole now points to genlist (1-1); length's hole still reaches 3 : 2 : 1 : _]
{evaluate genlist}
[graph: the genlist (1-1) node has become []; length's hole now reaches the full list 3 : 2 : 1 : []]
{evaluate goodsum}
[graph: 6 / length _, the hole pointing to the full list 3 : 2 : 1 : []]
etc

Thus heap size has grown to Θ(n).
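One common way around this (a sketch of my own, again using seq as an eagerness annotation; averageOnePass is a made-up name) is to compute the sum and the length in a single pass, so that nothing retains a second pointer to the list:

averageOnePass :: [Double] -> Double
averageOnePass = go 0 (0 :: Int)
  where
    go s n []     = s / fromIntegral n
    go s n (x:xs) = let s' = s + x
                        n' = n + 1
                    in s' `seq` n' `seq` go s' n' xs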

Case Study: Co-Routines

Lest you think all lazy evaluation does is take up a lot of memory, the following examples show how to use lazy evaluation properly to obtain compositional, asymptotically optimal programs.

Again, assume that goodsum behaves like this in general:

goodsum 5 (10:xs)
{match pattern, function application}
goodsum (5+10) xs
{magic}
goodsum 15 xs
etc

and this definition:

genlist 0 = []
genlist n = n : genlist (n-1)

Evaluate goodsum 0 (genlist 3) :

goodsum 0 (genlist 3)
{evaluate operand for pattern}
genlist 3
{match pattern, function application}
3 : genlist (3-1)
goodsum 0 (3 : genlist (3-1))
{magic}
goodsum 3 (genlist (3-1))
{evaluate operand for pattern}
genlist (3-1)
{evaluate operand, match pattern, function application}
2 : genlist (2-1)
goodsum 3 (2 : genlist (2-1))
{magic}
goodsum 5 (genlist (2-1))
{evaluate operand for pattern}
genlist (2-1)
{evaluate operand, match pattern, function application}
1 : genlist (1-1)
goodsum 5 (1 : genlist (1-1))
{magic}
goodsum 6 (genlist (1-1))
{evaluate operand for pattern}
genlist (1-1)
{evaluate operand, match pattern, function application}
[]
goodsum 6 []
{match pattern, function application}
6

For a while, it goes like the Average example. The important point this time is that, as no one else holds a pointer to the generated list, its cells are short-lived, like virtual particles. This program takes Θ(1) memory. Some compilers even optimize away the short-lived cells.

This shows an important idiom of using lazy evaluation: We have a producer, genlist 3, and we have a consumer, goodsum 0, and when put together, they behave like a pair of co-routines, passing control to the producer for one cell, then passing control to the consumer to use and lose that cell, then passing control to the producer for the next cell… We say that in Haskell we use lists as our for-loops.

The above example still does not do justice to the idiom. The idiom really shines when the consumer stops consuming early for whatever reason it feels like. The producer is not burdened with the stopping condition of the consumer. We get a cleaner separation. To be continued.
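A small preview of that (an example of my own; firstBigSquare is a made-up name): the consumer decides when to stop, and the infinite producer is never asked for more cells than needed.

firstBigSquare :: Integer
firstBigSquare = head (dropWhile (<= 1000) (map (^2) [1..]))
-- answers 1024 after the producer has built only 32 cells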


I have more Haskell Notes and Examples