The `foldl`

function that we discussed earlier is not the only place where
space leaks can happen in Haskell code. We will use it to illustrate how
nonstrict evaluation can sometimes be problematic and how to solve the
difficulties that can arise.

It is perfectly reasonable to skip this
section until you encounter a space leak “in the wild.”
Provided you use `foldr`

if you are
generating a list, and `foldl'`

instead of `foldl`

otherwise, space
leaks are unlikely to bother you in practice for a while.

We refer to an expression that is not
evaluated lazily as *strict*, so `foldl'`

is a strict left fold. It bypasses
Haskell’s usual nonstrict evaluation through the use of a
special function named `seq`

:

-- file: ch04/Fold.hs foldl' _ zero [] = zero foldl' step zero (x:xs) = let new = step zero x in new `seq` foldl' step new xs

This `seq`

function has a peculiar type, hinting
that it is not playing by the usual rules:

`ghci>`

seq :: a -> t -> t`:type seq`

It operates as follows: when a
`seq`

expression is evaluated, it
forces its first argument to be evaluated, and then returns its second
argument. It doesn’t actually do anything with the first argument.
`seq`

exists solely as a way to
force that value to be evaluated. Let’s walk through a brief
application to see what happens:

-- file: ch04/Fold.hs foldl' (+) 1 (2:[])

-- file: ch04/Fold.hs let new = 1 + 2 in new `seq` foldl' (+) new []

The use of `seq`

forcibly evaluates
`new`

to `3`

and returns its second
argument:

-- file: ch04/Fold.hs foldl' (+) 3 []

We end up with the following result:

-- file: ch04/Fold.hs 3

Without some direction, there is an
element of mystery to using `seq`

effectively. Here are some useful rules for using it well.

To have any effect, a `seq`

expression must be the first thing
evaluated in an expression:

-- file: ch04/Fold.hs -- incorrect: seq is hidden by the application of someFunc -- since someFunc will be evaluated first, seq may occur too late hiddenInside x y = someFunc (x `seq` y) -- incorrect: a variation of the above mistake hiddenByLet x y z = let a = x `seq` someFunc y in anotherFunc a z -- correct: seq will be evaluated first, forcing evaluation of x onTheOutside x y = x `seq` someFunc y

To strictly evaluate several values,
chain applications of `seq`

together:

-- file: ch04/Fold.hs chained x y z = x `seq` y `seq` someFunc z

A common mistake is to try to use `seq`

with two unrelated expressions:

-- file: ch04/Fold.hs badExpression step zero (x:xs) = seq (step zero x) (badExpression step (step zero x) xs)

Here, the apparent intention is to evaluate
`step zero x`

strictly. Since the expression is duplicated
in the body of the function, strictly evaluating the first instance of
it will have no effect on the second. The use of `let`

from the definition of `foldl'`

just shows illustrates how to
achieve this effect correctly.

When evaluating an expression, `seq`

stops as soon as it reaches a
constructor. For simple types such as numbers, this means that it will
evaluate them completely. Algebraic data types are a different story.
Consider the value `(1+2):(3+4):[]`

. If we apply `seq`

to this, it will evaluate the
`(1+2)`

thunk. Since it will stop when it reaches the first
`(:)`

constructor, it will have no effect on the second
thunk. The same is true for tuples: ```
seq ((1+2),(3+4))
True
```

will do nothing to the thunks inside the pair, since it
immediately hits the pair’s constructor.

If necessary, we can use normal functional programming techniques to work around these limitations:

-- file: ch04/Fold.hs strictPair (a,b) = a `seq` b `seq` (a,b) strictList (x:xs) = x `seq` x : strictList xs strictList [] = []

It is important to understand that `seq`

isn’t free: it has to perform a check
at runtime to see if an expression has been evaluated. Use it
sparingly. For instance, while our `strictPair`

function evaluates the contents
of a pair up to the first constructor, it adds the overheads of
pattern matching, two applications of `seq`

, and the construction of a new tuple.
If we were to measure its performance in the inner loop of a
benchmark, we might find that it slows down the program.

Aside from its performance cost if
overused, `seq`

is not a miracle
cure-all for memory consumption problems. Just because you
*can* evaluate something strictly doesn’t mean you
*should*. Careless use of `seq`

may do nothing at all, move existing
space leaks around, or introduce new leaks.

The best guides to whether `seq`

is necessary, and how well it is
working, are performance measurement and profiling, which we will
cover in Chapter 25. From a base of
empirical
measurement, you will develop a reliable sense of when `seq`

is most useful.

Start Free Trial

No credit card required