You are previewing Parallel and Concurrent Programming in Haskell.

Parallel and Concurrent Programming in Haskell

Cover of Parallel and Concurrent Programming in Haskell by Simon Marlow Published by O'Reilly Media, Inc.
  1. Special Upgrade Offer
  2. Preface
    1. Audience
    2. How to Read This Book
    3. Conventions Used in This Book
    4. Using Sample Code
    5. Safari® Books Online
    6. How to Contact Us
    7. Acknowledgments
  3. 1. Introduction
    1. Terminology: Parallelism and Concurrency
    2. Tools and Resources
    3. Sample Code
  4. I. Parallel Haskell
    1. 2. Basic Parallelism: The Eval Monad
      1. Lazy Evaluation and Weak Head Normal Form
      2. The Eval Monad, rpar, and rseq
      3. Example: Parallelizing a Sudoku Solver
      4. Deepseq
    2. 3. Evaluation Strategies
      1. Parameterized Strategies
      2. A Strategy for Evaluating a List in Parallel
      3. Example: The K-Means Problem
      4. GC’d Sparks and Speculative Parallelism
      5. Parallelizing Lazy Streams with parBuffer
      6. Chunking Strategies
      7. The Identity Property
    3. 4. Dataflow Parallelism: The Par Monad
      1. Example: Shortest Paths in a Graph
      2. Pipeline Parallelism
      3. Example: A Conference Timetable
      4. Example: A Parallel Type Inferencer
      5. Using Different Schedulers
      6. The Par Monad Compared to Strategies
    4. 5. Data Parallel Programming with Repa
      1. Arrays, Shapes, and Indices
      2. Operations on Arrays
      3. Example: Computing Shortest Paths
      4. Folding and Shape-Polymorphism
      5. Example: Image Rotation
      6. Summary
    5. 6. GPU Programming with Accelerate
      1. Overview
      2. Arrays and Indices
      3. Running a Simple Accelerate Computation
      4. Scalar Arrays
      5. Indexing Arrays
      6. Creating Arrays Inside Acc
      7. Zipping Two Arrays
      8. Constants
      9. Example: Shortest Paths
      10. Example: A Mandelbrot Set Generator
  5. II. Concurrent Haskell
    1. 7. Basic Concurrency: Threads and MVars
      1. A Simple Example: Reminders
      2. Communication: MVars
      3. MVar as a Simple Channel: A Logging Service
      4. MVar as a Container for Shared State
      5. MVar as a Building Block: Unbounded Channels
      6. Fairness
    2. 8. Overlapping Input/Output
      1. Exceptions in Haskell
      2. Error Handling with Async
      3. Merging
    3. 9. Cancellation and Timeouts
      1. Asynchronous Exceptions
      2. Masking Asynchronous Exceptions
      3. The bracket Operation
      4. Asynchronous Exception Safety for Channels
      5. Timeouts
      6. Catching Asynchronous Exceptions
      7. mask and forkIO
      8. Asynchronous Exceptions: Discussion
    4. 10. Software Transactional Memory
      1. Running Example: Managing Windows
      2. Blocking
      3. Blocking Until Something Changes
      4. Merging with STM
      5. Async Revisited
      6. Implementing Channels with STM
      7. An Alternative Channel Implementation
      8. Bounded Channels
      9. What Can We Not Do with STM?
      10. Performance
      11. Summary
    5. 11. Higher-Level Concurrency Abstractions
      1. Avoiding Thread Leakage
      2. Symmetric Concurrency Combinators
      3. Adding a Functor Instance
      4. Summary: The Async API
    6. 12. Concurrent Network Servers
      1. A Trivial Server
      2. Extending the Simple Server with State
      3. A Chat Server
    7. 13. Parallel Programming Using Threads
      1. How to Achieve Parallelism with Concurrency
      2. Example: Searching for Files
    8. 14. Distributed Programming
      1. The Distributed-Process Family of Packages
      2. Distributed Concurrency or Parallelism?
      3. A First Example: Pings
      4. Multi-Node Ping
      5. Typed Channels
      6. Handling Failure
      7. A Distributed Chat Server
      8. Exercise: A Distributed Key-Value Store
    9. 15. Debugging, Tuning, and Interfacing with Foreign Code
      1. Debugging Concurrent Programs
      2. Tuning Concurrent (and Parallel) Programs
      3. Concurrency and the Foreign Function Interface
    10. Index
  6. About the Author
  7. Colophon
  8. Special Upgrade Offer
  9. Copyright
O'Reilly logo

Part I. Parallel Haskell

Now that processor manufacturers have largely given up trying to squeeze more performance out of individual processors and have refocused their attention on providing us with more processors instead, the biggest gains in performance are to be had by using parallel techniques in our programs so as to make use of these extra cores. Parallel Haskell is aimed at providing access to multiple processors in a natural and robust way.

You might wonder whether the compiler could automatically parallelize programs for us. After all, it should be easier to do this in a purely functional language, where the only dependencies between computations are data dependencies, which are mostly perspicuous and thus readily analyzed. However, even in a purely functional language, automatic parallelization is thwarted by an age-old problem: To make the program faster, we have to gain more from parallelism than we lose due to the overhead of adding it, and compile-time analysis cannot make good judgments in this area. An alternative approach is to use runtime profiling to find good candidates for parallelization and to feed this information back into the compiler. Even this, however, has not been terribly successful in practice.

Fully automatic parallelization is still a pipe dream. However, the parallel programming models provided by Haskell do succeed in eliminating some mundane or error-prone aspects traditionally associated with parallel programming:

  • Parallel programming in Haskell is deterministic: The parallel program always produces the same answer, regardless of how many processors are used to run it. So parallel programs can be debugged without actually running them in parallel. Furthermore, the programmer can be confident that adding parallelism will not introduce lurking race conditions or deadlocks that would be hard to eliminate with testing.
  • Parallel Haskell programs are high-level and declarative and do not explicitly deal with concepts like synchronization or communication. The programmer indicates where the parallelism is, and the details of actually running the program in parallel are left to the runtime system. This is both a blessing and a curse:

    • By embodying fewer operational details, parallel Haskell programs are abstract and are therefore likely to work on a wide range of parallel hardware.
    • Parallel Haskell programs can take advantage of existing highly tuned technology in the runtime system, such as parallel garbage collection. Furthermore, the program gets to benefit from future improvements made to the runtime with no additional effort.
    • Because a lot of the details of execution are hidden, performance problems can be hard to understand. Moreover, the programmer has less control than he would in a lower-level programming language, so fixing performance problems can be tricky. Indeed, this problem is not limited to Parallel Haskell: It will be familiar to anyone who has tried to optimize Haskell programs at all. In this book, I hope to demonstrate how to identify and work around the most common issues that can occur in practice.

The main thing that the parallel Haskell programmer has to think about is partitioning: dividing up the problem into pieces that can be computed in parallel. Ideally, you want to have enough tasks to keep all the processors busy continuously. However, your efforts may be frustrated in two ways:

If you make your tasks too small, the overhead of managing the tasks outweighs any benefit you might get from running them in parallel. So granularity should be large enough to dwarf overhead, but not too large, because then you risk not having enough work to keep all the processors busy, especially toward the end of the execution when there are fewer tasks left.
Data dependencies
When one task depends on another, they must be performed sequentially. The first two programming models we will be encountering in this book take different approaches to data dependencies: In Chapter 3, data dependencies are entirely implicit, whereas in Chapter 4 they are explicit. Programming with explicit data dependencies is less concise, but it can be easier to understand and fix problems when the data dependencies are not hidden.

In the following chapters, we will describe the various parallel programming models that Haskell provides:

  • Chapters 2 and 3 introduce the Eval monad and Evaluation Strategies, which are suitable for expressing parallelism in Haskell programs that are not heavily numerical or array-based. These programming models are well established, and there are many good examples of using them to achieve parallelism.
  • Chapter 4 introduces the Par monad, a more recent parallel programming model that also aims at parallelizing ordinary Haskell code but with a different trade-off: It affords the programmer more control in exchange for some of the conciseness and modularity of Strategies.
  • Chapter 5 looks at the Repa library, which provides a rich set of combinators for building parallel array computations. You can express a complex array algorithm as the composition of several simpler operations, and the library automatically optimizes the composition into a single-pass algorithm using a technique called fusion. Furthermore, the implementation of the library automatically parallelizes the operation using the available processors.
  • Chapter 6 discusses programming with a graphics processing unit (GPU) using the Accelerate library, which offers a similar programming model to Repa but runs the computation directly on the GPU.

Parallelizing Haskell code can be a joyful experience: Adding a small annotation to your program can suddenly make it run several times faster on a multicore machine. It can also be a frustrating experience. As we’ll see over the course of the next few chapters, there are a number of pitfalls waiting to trap you. Some of these are Haskell-specific, and some are part and parcel of parallel programming in any language. Hopefully by the end you’ll have built up enough of an intuition for parallel programming that you’ll be able to achieve decent parallel speedups in your own code using the techniques covered.

Keep in mind while reading this part of the book that obtaining reliable results with parallelism is inherently difficult because in today’s complex computing devices, performance depends on a vast number of interacting components. For this reason, the results I get from running the examples on my computers might differ somewhat from the results you get on your hardware. Hopefully the difference isn’t huge—if it is, that might indicate a problem in GHC that you should report. The important thing is to be aware that performance is fragile, especially where parallelism is concerned.

The best content for your career. Discover unlimited learning on demand for around $1/day.