O'Reilly logo

Real World Haskell by Donald Bruce Stewart, Bryan O'Sullivan, John Goerzen

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Code Reuse Through Composition

It seems a shame to introduce a new function, suffixes, that does almost the same thing as the existing tails function. Surely we can do better?

Recall the init function we introduced in Working with Lists—it returns all but the last element of a list:

-- file: ch04/SuffixTree.hs
suffixes2 xs = init (tails xs)

This suffixes2 function behaves identically to suffixes, but it’s a single line of code:

ghci> suffixes2 "foo"

If we take a step back, we see the glimmer of a pattern. We’re applying a function, then applying another function to its result. Let’s turn that pattern into a function definition:

-- file: ch04/SuffixTree.hs
compose :: (b -> c) -> (a -> b) -> a -> c
compose f g x = f (g x)

We now have a function, compose, that we can use to glue two other functions together:

-- file: ch04/SuffixTree.hs
suffixes3 xs = compose init tails xs

Haskell’s automatic currying lets us drop the xs variable, so we can make our definition even shorter:

-- file: ch04/SuffixTree.hs
suffixes4 = compose init tails

Fortunately, we don’t need to write our own compose function. Plugging functions into each other like this is so common that the Prelude provides function composition via the (.) operator:

-- file: ch04/SuffixTree.hs
suffixes5 = init . tails

The (.) operator isn’t a special piece of language syntax—it’s just a normal operator:

ghci> :type (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
ghci> :type suffixes
suffixes :: [a] -> [[a]]
ghci> :type suffixes5
suffixes5 :: [a] -> [[a]]
ghci> suffixes5 "foo"

We can create new functions at any time by writing chains of composed functions, stitched together with (.), so long (of course) as the result type of the function on the right of each (.) matches the type of parameter that the function on the left can accept.

As an example, let’s solve a simple puzzle. Count the number of words in a string that begins with a capital letter:

ghci> :module +Data.Char
ghci> let capCount = length . filter (isUpper . head) . words
ghci> capCount "Hello there, Mom!"

We can understand what this composed function does by examining its pieces. The (.) function is right-associative, so we will proceed from right to left:

ghci> :type words
words :: String -> [String]

The words function has a result type of [String], so whatever is on the left side of (.) must accept a compatible argument:

ghci> :type isUpper . head
isUpper . head :: [Char] -> Bool

This function returns True if a word begins with a capital letter (try it in ghci), so filter (isUpper . head) returns a list of Strings containing only words that begin with capital letters:

ghci> :type filter (isUpper . head)
filter (isUpper . head) :: [[Char]] -> [[Char]]

Since this expression returns a list, all that remains is to calculate the length of the list, which we do with another composition.

Here’s another example, drawn from a real application. We want to extract a list of macro names from a C header file shipped with libpcap, a popular network packet-filtering library. The header file contains a large number definitions of the following form:

#define DLT_EN10MB      1       /* Ethernet (10Mb) */
#define DLT_EN3MB       2       /* Experimental Ethernet (3Mb) */
#define DLT_AX25        3       /* Amateur Radio AX.25 */

Our goal is to extract names such as DLT_EN10MB and DLT_AX25:

-- file: ch04/dlts.hs
import Data.List (isPrefixOf)

dlts :: String -> [String]

dlts = foldr step [] . lines

We treat an entire file as a string, split it up with lines, and then apply foldr step [] to the resulting list of lines. The step helper function operates on a single line:

-- file: ch04/dlts.hs
  where step l ds
          | "#define DLT_" `isPrefixOf` l = secondWord l : ds
          | otherwise                     = ds
        secondWord = head . tail . words

If we match a macro definition with our guard expression, we cons the name of the macro onto the head of the list we’re returning; otherwise, we leave the list untouched.

While the individual functions in the body of secondWord are familiar to us by now, it can take a little practice to piece together a chain of compositions such as this. Let’s walk through the procedure.

Once again, we proceed from right to left. The first function is words:

ghci> :type words
words :: String -> [String]
ghci> words "#define DLT_CHAOS    5"

We then apply tail to the result of words:

ghci> :type tail
tail :: [a] -> [a]
ghci> tail ["#define","DLT_CHAOS","5"]
ghci> :type tail . words
tail . words :: String -> [String]
ghci> (tail . words) "#define DLT_CHAOS    5"

Finally, applying head to the result of drop 1 . words will give us the name of our macro:

ghci> :type head . tail . words
head . tail . words :: String -> String
ghci> (head . tail . words) "#define DLT_CHAOS    5"

Use Your Head Wisely

After warning against unsafe list functions earlier in this chapter in Safely and Sanely Working with Crashy Functions, here we are calling both head and tail, two of those unsafe list functions. What gives?

In this case, we can assure ourselves by inspection that we’re safe from a runtime failure. The pattern guard in the definition of step contains two words, so when we apply words to any string that makes it past the guard, we’ll have a list of at least two elements: "#define" and some macro beginning with "DLT_".

This is the kind of reasoning we ought to do to convince ourselves that our code won’t explode when we call partial functions. Don’t forget our earlier admonition: calling unsafe functions such as this requires care and can often make our code more fragile in subtle ways. If for some reason we modified the pattern guard to only contain one word, we could expose ourselves to the possibility of a crash, as the body of the function assumes that it will receive two words.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required