You are previewing Real World Haskell.

Real World Haskell

Cover of Real World Haskell by John Goerzen... Published by O'Reilly Media, Inc.
  1. Real World Haskell
    1. SPECIAL OFFER: Upgrade this ebook with O’Reilly
    2. A Note Regarding Supplemental Files
    3. Preface
      1. Have We Got a Deal for You!
      2. What to Expect from This Book
      3. What to Expect from Haskell
      4. A Brief Sketch of Haskell’s History
      5. Helpful Resources
      6. Conventions Used in This Book
      7. Using Code Examples
      8. Safari® Books Online
      9. How to Contact Us
      10. Acknowledgments
    4. 1. Getting Started
      1. Your Haskell Environment
      2. Getting Started with ghci, the Interpreter
      3. Basic Interaction: Using ghci as a Calculator
      4. Command-Line Editing in ghci
      5. Lists
      6. Strings and Characters
      7. First Steps with Types
      8. A Simple Program
    5. 2. Types and Functions
      1. Why Care About Types?
      2. Haskell’s Type System
      3. What to Expect from the Type System
      4. Some Common Basic Types
      5. Function Application
      6. Useful Composite Data Types: Lists and Tuples
      7. Functions over Lists and Tuples
      8. Function Types and Purity
      9. Haskell Source Files, and Writing Simple Functions
      10. Understanding Evaluation by Example
      11. Polymorphism in Haskell
      12. The Type of a Function of More Than One Argument
      13. Why the Fuss over Purity?
      14. Conclusion
    6. 3. Defining Types, Streamlining Functions
      1. Defining a New Data Type
      2. Type Synonyms
      3. Algebraic Data Types
      4. Pattern Matching
      5. Record Syntax
      6. Parameterized Types
      7. Recursive Types
      8. Reporting Errors
      9. Introducing Local Variables
      10. The Offside Rule and Whitespace in an Expression
      11. The case Expression
      12. Common Beginner Mistakes with Patterns
      13. Conditional Evaluation with Guards
    7. 4. Functional Programming
      1. Thinking in Haskell
      2. A Simple Command-Line Framework
      3. Warming Up: Portably Splitting Lines of Text
      4. Infix Functions
      5. Working with Lists
      6. How to Think About Loops
      7. Anonymous (lambda) Functions
      8. Partial Function Application and Currying
      9. As-patterns
      10. Code Reuse Through Composition
      11. Tips for Writing Readable Code
      12. Space Leaks and Strict Evaluation
    8. 5. Writing a Library: Working with JSON Data
      1. A Whirlwind Tour of JSON
      2. Representing JSON Data in Haskell
      3. The Anatomy of a Haskell Module
      4. Compiling Haskell Source
      5. Generating a Haskell Program and Importing Modules
      6. Printing JSON Data
      7. Type Inference Is a Double-Edged Sword
      8. A More General Look at Rendering
      9. Developing Haskell Code Without Going Nuts
      10. Pretty Printing a String
      11. Arrays and Objects, and the Module Header
      12. Writing a Module Header
      13. Fleshing Out the Pretty-Printing Library
      14. Creating a Package
      15. Practical Pointers and Further Reading
    9. 6. Using Typeclasses
      1. The Need for Typeclasses
      2. What Are Typeclasses?
      3. Declaring Typeclass Instances
      4. Important Built-in Typeclasses
      5. Automatic Derivation
      6. Typeclasses at Work: Making JSON Easier to Use
      7. Living in an Open World
      8. How to Give a Type a New Identity
      9. JSON Typeclasses Without Overlapping Instances
      10. The Dreaded Monomorphism Restriction
      11. Conclusion
    10. 7. I/O
      1. Classic I/O in Haskell
      2. Working with Files and Handles
      3. Extended Example: Functional I/O and Temporary Files
      4. Lazy I/O
      5. The IO Monad
      6. Is Haskell Really Imperative?
      7. Side Effects with Lazy I/O
      8. Buffering
      9. Reading Command-Line Arguments
      10. Environment Variables
    11. 8. Efficient File Processing, Regular Expressions, and Filename Matching
      1. Efficient File Processing
      2. Filename Matching
      3. Regular Expressions in Haskell
      4. More About Regular Expressions
      5. Translating a glob Pattern into a Regular Expression
      6. An important Aside: Writing Lazy Functions
      7. Making Use of Our Pattern Matcher
      8. Handling Errors Through API Design
      9. Putting Our Code to Work
    12. 9. I/O Case Study: A Library for Searching the Filesystem
      1. The find Command
      2. Starting Simple: Recursively Listing a Directory
      3. A Naive Finding Function
      4. Predicates: From Poverty to Riches, While Remaining Pure
      5. Sizing a File Safely
      6. A Domain-Specific Language for Predicates
      7. Controlling Traversal
      8. Density, Readability, and the Learning Process
      9. Another Way of Looking at Traversal
      10. Useful Coding Guidelines
    13. 10. Code Case Study: Parsing a Binary Data Format
      1. Grayscale Files
      2. Parsing a Raw PGM File
      3. Getting Rid of Boilerplate Code
      4. Implicit State
      5. Introducing Functors
      6. Writing a Functor Instance for Parse
      7. Using Functors for Parsing
      8. Rewriting Our PGM Parser
      9. Future Directions
    14. 11. Testing and Quality Assurance
      1. QuickCheck: Type-Based Testing
      2. Testing Case Study: Specifying a Pretty Printer
      3. Measuring Test Coverage with HPC
    15. 12. Barcode Recognition
      1. A Little Bit About Barcodes
      2. Introducing Arrays
      3. Encoding an EAN-13 Barcode
      4. Constraints on Our Decoder
      5. Divide and Conquer
      6. Turning a Color Image into Something Tractable
      7. What Have We Done to Our Image?
      8. Finding Matching Digits
      9. Life Without Arrays or Hash Tables
      10. Turning Digit Soup into an Answer
      11. Working with Row Data
      12. Pulling It All Together
      13. A Few Comments on Development Style
    16. 13. Data Structures
      1. Association Lists
      2. Maps
      3. Functions Are Data, Too
      4. Extended Example: /etc/passwd
      5. Extended Example: Numeric Types
      6. Taking Advantage of Functions as Data
      7. General-Purpose Sequences
    17. 14. Monads
      1. Revisiting Earlier Code Examples
      2. Looking for Shared Patterns
      3. The Monad Typeclass
      4. And Now, a Jargon Moment
      5. Using a New Monad: Show Your Work!
      6. Mixing Pure and Monadic Code
      7. Putting a Few Misconceptions to Rest
      8. Building the Logger Monad
      9. The Maybe Monad
      10. The List Monad
      11. Desugaring of do Blocks
      12. The State Monad
      13. Monads and Functors
      14. The Monad Laws and Good Coding Style
    18. 15. Programming with Monads
      1. Golfing Practice: Association Lists
      2. Generalized Lifting
      3. Looking for Alternatives
      4. Adventures in Hiding the Plumbing
      5. Separating Interface from Implementation
      6. The Reader Monad
      7. A Return to Automated Deriving
      8. Hiding the IO Monad
    19. 16. Using Parsec
      1. First Steps with Parsec: Simple CSV Parsing
      2. The sepBy and endBy Combinators
      3. Choices and Errors
      4. Extended Example: Full CSV Parser
      5. Parsec and MonadPlus
      6. Parsing a URL-Encoded Query String
      7. Supplanting Regular Expressions for Casual Parsing
      8. Parsing Without Variables
      9. Applicative Functors for Parsing
      10. Applicative Parsing by Example
      11. Parsing JSON Data
      12. Parsing a HTTP Request
    20. 17. Interfacing with C: The FFI
      1. Foreign Language Bindings: The Basics
      2. Regular Expressions for Haskell: A Binding for PCRE
      3. Passing String Data Between Haskell and C
      4. Matching on Strings
    21. 18. Monad Transformers
      1. Motivation: Boilerplate Avoidance
      2. A Simple Monad Transformer Example
      3. Common Patterns in Monads and Monad Transformers
      4. Stacking Multiple Monad Transformers
      5. Moving Down the Stack
      6. Understanding Monad Transformers by Building One
      7. Transformer Stacking Order Is Important
      8. Putting Monads and Monad Transformers into Perspective
    22. 19. Error Handling
      1. Error Handling with Data Types
      2. Exceptions
      3. Error Handling in Monads
    23. 20. Systems Programming in Haskell
      1. Running External Programs
      2. Directory and File Information
      3. Program Termination
      4. Dates and Times
      5. Extended Example: Piping
    24. 21. Using Databases
      1. Overview of HDBC
      2. Installing HDBC and Drivers
      3. Connecting to Databases
      4. Transactions
      5. Simple Queries
      6. SqlValue
      7. Query Parameters
      8. Prepared Statements
      9. Reading Results
      10. Database Metadata
      11. Error Handling
    25. 22. Extended Example: Web Client Programming
      1. Basic Types
      2. The Database
      3. The Parser
      4. Downloading
      5. Main Program
    26. 23. GUI Programming with gtk2hs
      1. Installing gtk2hs
      2. Overview of the GTK+ Stack
      3. User Interface Design with Glade
      4. Event-Driven Programming
      5. Initializing the GUI
      6. The Add Podcast Window
      7. Long-Running Tasks
      8. Using Cabal
    27. 24. Concurrent and Multicore Programming
      1. Defining Concurrency and Parallelism
      2. Concurrent Programming with Threads
      3. Simple Communication Between Threads
      4. The Main Thread and Waiting for Other Threads
      5. Communicating over Channels
      6. Useful Things to Know About
      7. Shared-State Concurrency Is Still Hard
      8. Using Multiple Cores with GHC
      9. Parallel Programming in Haskell
      10. Parallel Strategies and MapReduce
    28. 25. Profiling and Optimization
      1. Profiling Haskell Programs
      2. Controlling Evaluation
      3. Understanding Core
      4. Advanced Techniques: Fusion
    29. 26. Advanced Library Design: Building a Bloom Filter
      1. Introducing the Bloom Filter
      2. Use Cases and Package Layout
      3. Basic Design
      4. The ST Monad
      5. Designing an API for Qualified Import
      6. Creating a Mutable Bloom Filter
      7. The Immutable API
      8. Creating a Friendly Interface
      9. Creating a Cabal Package
      10. Testing with QuickCheck
      11. Performance Analysis and Tuning
    30. 27. Sockets and Syslog
      1. Basic Networking
      2. Communicating with UDP
      3. Communicating with TCP
    31. 28. Software Transactional Memory
      1. The Basics
      2. Some Simple Examples
      3. STM and Safety
      4. Retrying a Transaction
      5. Choosing Between Alternatives
      6. I/O and STM
      7. Communication Between Threads
      8. A Concurrent Web Link Checker
      9. Practical Aspects of STM
    32. A. Installing GHC and Haskell Libraries
      1. Installing GHC
      2. Installing Haskell Software
    33. B. Characters, Strings, and Escaping Rules
      1. Writing Character and String Literals
      2. International Language Support
      3. Escaping Text
    34. Index
    35. About the Authors
    36. Colophon
    37. SPECIAL OFFER: Upgrade this ebook with O’Reilly
O'Reilly logo

Warming Up: Portably Splitting Lines of Text

Haskell provides a built-in function, lines, that lets us split a text string on line boundaries. It returns a list of strings with line termination characters omitted:

ghci> :type lines
lines :: String -> [String]
ghci> lines "line 1\nline 2"
["line 1","line 2"]
ghci> lines "foo\n\nbar\n"
["foo","","bar"]

While lines looks useful, it relies on us reading a file in text mode in order to work. Text mode is a feature common to many programming languages; it provides a special behavior when we read and write files on Windows. When we read a file in text mode, the file I/O library translates the line-ending sequence "\r\n" (carriage return followed by newline) to "\n" (newline alone), and it does the reverse when we write a file. On Unix-like systems, text mode does not perform any translation. As a result of this difference, if we read a file on one platform that was written on the other, the line endings are likely to become a mess. (Both readFile and writeFile operate in text mode.)

ghci> lines "a\r\nb"
["a\r","b"]

The lines function splits only on newline characters, leaving carriage returns dangling at the ends of lines. If we read a Windows-generated text file on a Linux or Unix box, we’ll get trailing carriage returns at the end of each line.

We have comfortably used Python’s universal newline support for years; this transparently handles Unix and Windows line-ending conventions for us. We would like to provide something similar in Haskell.

Since we are still early in our career of reading Haskell code, we will discuss our Haskell implementation in some detail:

-- file: ch04/SplitLines.hs
splitLines :: String -> [String]

Our function’s type signature indicates that it accepts a single string, the contents of a file with some unknown line-ending convention. It returns a list of strings, representing each line from the file:

-- file: ch04/SplitLines.hs
splitLines [] = []
splitLines cs =
    let (pre, suf) = break isLineTerminator cs
    in  pre : case suf of 
                ('\r':'\n':rest) -> splitLines rest
                ('\r':rest)      -> splitLines rest
                ('\n':rest)      -> splitLines rest
                _                -> []

isLineTerminator c = c == '\r' || c == '\n'

Before we dive into detail, notice first how we organized our code. We presented the important pieces of code first, keeping the definition of isLineTerminator until later. Because we have given the helper function a readable name, we can guess what it does even before we’ve read it, which eases the smooth flow of reading the code.

The Prelude defines a function named break that we can use to partition a list into two parts. It takes a function as its first parameter. That function must examine an element of the list and return a Bool to indicate whether to break the list at that point. The break function returns a pair, which consists of the sublist consumed before the predicate returned True (the prefix) and the rest of the list (the suffix):

ghci> break odd [2,4,5,6,8]
([2,4],[5,6,8])
ghci> :module +Data.Char
ghci> break isUpper "isUpper"
("is","Upper")

Since we need only to match a single carriage return or newline at a time, examining each element of the list one by one is good enough for our needs.

The first equation of splitLines indicates that if we match an empty string, we have no further work to do.

In the second equation, we first apply break to our input string. The prefix is the substring before a line terminator, and the suffix is the remainder of the string. The suffix will include the line terminator, if any is present.

The pre : expression tells us that we should add the pre value to the front of the list of lines. We then use a case expression to inspect the suffix, so we can decide what to do next. The result of the case expression will be used as the second argument to the (:) list constructor.

The first pattern matches a string that begins with a carriage return, followed by a newline. The variable rest is bound to the remainder of the string. The other patterns are similar, so they ought to be easy to follow.

A prose description of a Haskell function isn’t necessarily easy to follow. We can gain a better understanding by stepping into ghci and observing the behavior of the function in different circumstances.

Let’s start by partitioning a string that doesn’t contain any line terminators:

ghci> splitLines "foo"
["foo"]

Here, our application of break never finds a line terminator, so the suffix it returns is empty:

ghci> break isLineTerminator "foo"
("foo","")

The case expression in splitLines must thus be matching on the fourth branch, and we’re finished. What about a slightly more interesting case?

ghci> splitLines "foo\r\nbar"
["foo","bar"]

Our first application of break gives us a nonempty suffix:

ghci> break isLineTerminator "foo\r\nbar"
("foo","\r\nbar")

Because the suffix begins with a carriage return followed by a newline, we match on the first branch of the case expression. This gives us pre bound to "foo", and suf bound to "bar". We apply splitLines recursively, this time on "bar" alone:

ghci> splitLines "bar"
["bar"]

The result is that we construct a list whose head is "foo" and whose tail is ["bar"]:

ghci> "foo" : ["bar"]
["foo","bar"]

This sort of experimenting with ghci is a helpful way to understand and debug the behavior of a piece of code. It has an even more important benefit that is almost accidental in nature. It can be tricky to test complicated code from ghci, so we will tend to write smaller functions, which can further help the readability of our code.

This style of creating and reusing small, powerful pieces of code is a fundamental part of functional programming.

A Line-Ending Conversion Program

Let’s hook our splitLines function into the little framework that we wrote earlier. Make a copy of the InteractWith.hs source file; let’s call the new file FixLines.hs. Add the splitLines function to the new source file. Since our function must produce a single String, we must stitch the list of lines back together. The Prelude provides an unlines function that concatenates a list of strings, adding a newline to the end of each:

-- file: ch04/SplitLines.hs
fixLines :: String -> String
fixLines input = unlines (splitLines input)

If we replace the id function with fixLines, we can compile an executable that will convert a text file to our system’s native line ending:

$ ghc --make FixLines
[1 of 1] Compiling Main             ( FixLines.hs, FixLines.o )
Linking FixLines ...

If you are on a Windows system, find and download a text file that was created on a Unix system (for example, gpl-3.0.txt [http://www.gnu.org/licenses/gpl-3.0.txt]). Open it in the standard Notepad text editor. The lines should all run together, making the file almost unreadable. Process the file using the FixLines command you just created, and open the output file in Notepad. The line endings should now be fixed up.

On Unix-like systems, the standard pagers and editors hide Windows line endings, making it more difficult to verify that FixLines is actually eliminating them. Here are a few commands that should help:

$ file gpl-3.0.txt
gpl-3.0.txt: ASCII English text
$ unix2dos gpl-3.0.txt
unix2dos: converting file gpl-3.0.txt to DOS format ...
$ file gpl-3.0.txt
gpl-3.0.txt: ASCII English text, with CRLF line terminators

The best content for your career. Discover unlimited learning on demand for around $1/day.