The Art of Multiprocessor Programming

Book Description

The computer industry's shift from single-processor to multiprocessor architectures demands a fundamental change in how programs are written. To leverage the performance and power of multiprocessor programming, also known as multicore programming, you need to learn the new principles, algorithms, and tools presented in this book. It includes fully developed Java examples detailing data structures, synchronization techniques, transactional memory, and more.

Prof. Maurice Herlihy, who coined the phrase "transactional memory," is on the faculty of Brown University. He is the recipient of the 2003 Dijkstra Prize in distributed computing. Prof. Nir Shavit is on the faculty of Tel-Aviv University and a member of the technical staff at Sun Microsystems Laboratories. In 2004 they shared the Gödel Prize, the highest award in theoretical computer science.



  • The book on multicore programming, the new paradigm of computer science
  • Written by the world's most revered experts in multiprocessor programming and performance
  • Includes examples, models, exercises, PowerPoint slides, and sample Java programs
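As a small taste of the kind of Java synchronization code the book develops (this sketch is not taken from the book; the class names are illustrative), here is a minimal two-thread Peterson lock, the algorithm covered in Section 2.3.3. It uses `java.util.concurrent.atomic` types so the flag and victim writes are visible across threads under the Java memory model:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of Peterson's two-thread mutual exclusion lock.
class PetersonLock {
    private final AtomicBoolean[] flag =
        { new AtomicBoolean(false), new AtomicBoolean(false) };
    private final AtomicInteger victim = new AtomicInteger(0);

    public void lock(int me) {          // me is 0 or 1
        int other = 1 - me;
        flag[me].set(true);             // announce interest
        victim.set(me);                 // defer to the other thread
        while (flag[other].get() && victim.get() == me) { } // spin
    }

    public void unlock(int me) {
        flag[me].set(false);
    }
}

public class Demo {
    static int counter = 0;             // protected by the lock

    static int run() throws InterruptedException {
        counter = 0;
        PetersonLock lock = new PetersonLock();
        Thread[] ts = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    lock.lock(id);
                    counter++;          // critical section
                    lock.unlock(id);
                }
            });
        }
        for (Thread t : ts) t.start();
        for (Thread t : ts) t.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());      // 200000 if mutual exclusion holds
    }
}
```

With two threads each performing 100,000 locked increments, the final count is 200,000; a data race (e.g., plain non-atomic flags) could lose updates.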

Table of Contents

  1. Copyright
    1. Dedication
  2. Acknowledgments
  4. Preface
  5. 1. Introduction
    1. 1.1. Shared Objects and Synchronization
    2. 1.2. A Fable
      1. 1.2.1. Properties of Mutual Exclusion
      2. 1.2.2. The Moral
    3. 1.3. The Producer–Consumer Problem
    4. 1.4. The Readers–Writers Problem
    5. 1.5. The Harsh Realities of Parallelization
    6. 1.6. Parallel Programming
    7. 1.7. Chapter Notes
    8. 1.8. Exercises
  6. I. Principles
    1. 2. Mutual Exclusion
      1. 2.1. Time
      2. 2.2. Critical Sections
      3. 2.3. 2-Thread Solutions
        1. 2.3.1. The LockOne Class
        2. 2.3.2. The LockTwo Class
        3. 2.3.3. The Peterson Lock
      4. 2.4. The Filter Lock
      5. 2.5. Fairness
      6. 2.6. Lamport’s Bakery Algorithm
      7. 2.7. Bounded Timestamps
      8. 2.8. Lower Bounds on the Number of Locations
      9. 2.9. Chapter Notes
      10. 2.10. Exercises
    2. 3. Concurrent Objects
      1. 3.1. Concurrency and Correctness
      2. 3.2. Sequential Objects
      3. 3.3. Quiescent Consistency
        1. 3.3.1. Remarks
      4. 3.4. Sequential Consistency
        1. 3.4.1. Remarks
      5. 3.5. Linearizability
        1. 3.5.1. Linearization Points
        2. 3.5.2. Remarks
      6. 3.6. Formal Definitions
        1. 3.6.1. Linearizability
        2. 3.6.2. Compositional Linearizability
        3. 3.6.3. The Nonblocking Property
      7. 3.7. Progress Conditions
        1. 3.7.1. Dependent Progress Conditions
      8. 3.8. The Java Memory Model
        1. 3.8.1. Locks and Synchronized Blocks
        2. 3.8.2. Volatile Fields
        3. 3.8.3. Final Fields
      9. 3.9. Remarks
      10. 3.10. Chapter Notes
      11. 3.11. Exercises
    3. 4. Foundations of Shared Memory
      1. 4.1. The Space of Registers
      2. 4.2. Register Constructions
        1. 4.2.1. MRSW Safe Registers
        2. 4.2.2. A Regular Boolean MRSW Register
        3. 4.2.3. A Regular M-Valued MRSW Register
        4. 4.2.4. An Atomic SRSW Register
        5. 4.2.5. An Atomic MRSW Register
        6. 4.2.6. An Atomic MRMW Register
      3. 4.3. Atomic Snapshots
        1. 4.3.1. An Obstruction-Free Snapshot
        2. 4.3.2. A Wait-Free Snapshot
        3. 4.3.3. Correctness Arguments
      4. 4.4. Chapter Notes
      5. 4.5. Exercises
    4. 5. The Relative Power of Primitive Synchronization Operations
      1. 5.1. Consensus Numbers
        1. 5.1.1. States and Valence
      2. 5.2. Atomic Registers
      3. 5.3. Consensus Protocols
      4. 5.4. FIFO Queues
      5. 5.5. Multiple Assignment Objects
      6. 5.6. Read–Modify–Write Operations
      7. 5.7. Common2 RMW Operations
      8. 5.8. The compareAndSet() Operation
      9. 5.9. Chapter Notes
      10. 5.10. Exercises
    5. 6. Universality of Consensus
      1. 6.1. Introduction
      2. 6.2. Universality
      3. 6.3. A Lock-Free Universal Construction
      4. 6.4. A Wait-Free Universal Construction
      5. 6.5. Chapter Notes
      6. 6.6. Exercises
  7. II. Practice
    1. 7. Spin Locks and Contention
      1. 7.1. Welcome to the Real World
      2. 7.2. Test-And-Set Locks
      3. 7.3. TAS-Based Spin Locks Revisited
      4. 7.4. Exponential Backoff
      5. 7.5. Queue Locks
        1. 7.5.1. Array-Based Locks
        2. 7.5.2. The CLH Queue Lock
        3. 7.5.3. The MCS Queue Lock
      6. 7.6. A Queue Lock with Timeouts
      7. 7.7. A Composite Lock
        1. 7.7.1. A Fast-Path Composite Lock
      8. 7.8. Hierarchical Locks
        1. 7.8.1. A Hierarchical Backoff Lock
        2. 7.8.2. A Hierarchical CLH Queue Lock
      9. 7.9. One Lock To Rule Them All
      10. 7.10. Chapter Notes
      11. 7.11. Exercises
    2. 8. Monitors and Blocking Synchronization
      1. 8.1. Introduction
      2. 8.2. Monitor Locks and Conditions
        1. 8.2.1. Conditions
        2. 8.2.2. The Lost-Wakeup Problem
      3. 8.3. Readers–Writers Locks
        1. 8.3.1. Simple Readers–Writers Lock
        2. 8.3.2. Fair Readers–Writers Lock
      4. 8.4. Our Own Reentrant Lock
      5. 8.5. Semaphores
      6. 8.6. Chapter Notes
      7. 8.7. Exercises
    3. 9. Linked Lists: The Role of Locking
      1. 9.1. Introduction
      2. 9.2. List-Based Sets
      3. 9.3. Concurrent Reasoning
      4. 9.4. Coarse-Grained Synchronization
      5. 9.5. Fine-Grained Synchronization
      6. 9.6. Optimistic Synchronization
      7. 9.7. Lazy Synchronization
      8. 9.8. Non-Blocking Synchronization
      9. 9.9. Discussion
      10. 9.10. Chapter Notes
      11. 9.11. Exercises
    4. 10. Concurrent Queues and the ABA Problem
      1. 10.1. Introduction
      2. 10.2. Queues
      3. 10.3. A Bounded Partial Queue
      4. 10.4. An Unbounded Total Queue
      5. 10.5. An Unbounded Lock-Free Queue
      6. 10.6. Memory Reclamation and the ABA Problem
        1. 10.6.1. A Naïve Synchronous Queue
      7. 10.7. Dual Data Structures
      8. 10.8. Chapter Notes
      9. 10.9. Exercises
    5. 11. Concurrent Stacks and Elimination
      1. 11.1. Introduction
      2. 11.2. An Unbounded Lock-Free Stack
      3. 11.3. Elimination
      4. 11.4. The Elimination Backoff Stack
        1. 11.4.1. A Lock-Free Exchanger
        2. 11.4.2. The Elimination Array
      5. 11.5. Chapter Notes
      6. 11.6. Exercises
    6. 12. Counting, Sorting, and Distributed Coordination
      1. 12.1. Introduction
      2. 12.2. Shared Counting
      3. 12.3. Software Combining
        1. 12.3.1. Overview
          1. IDLE
          2. FIRST
          3. ROOT
        2. 12.3.2. An Extended Example
        3. 12.3.3. Performance and Robustness
      4. 12.4. Quiescently Consistent Pools and Counters
      5. 12.5. Counting Networks
        1. 12.5.1. Networks That Count
        2. 12.5.2. The Bitonic Counting Network
          1. A Software Bitonic Counting Network
          2. Proof of Correctness
          3. A Periodic Counting Network
          4. A Software Periodic Counting Network
        3. 12.5.3. Performance and Pipelining
      6. 12.6. Diffracting Trees
      7. 12.7. Parallel Sorting
      8. 12.8. Sorting Networks
        1. 12.8.1. Designing a Sorting Network
          1. A Bitonic Sorting Algorithm
      9. 12.9. Sample Sorting
      10. 12.10. Distributed Coordination
      11. 12.11. Chapter Notes
      12. 12.12. Exercises
    7. 13. Concurrent Hashing and Natural Parallelism
      1. 13.1. Introduction
      2. 13.2. Closed-Address Hash Sets
        1. 13.2.1. A Coarse-Grained Hash Set
        2. 13.2.2. A Striped Hash Set
        3. 13.2.3. A Refinable Hash Set
      3. 13.3. A Lock-Free Hash Set
        1. 13.3.1. Recursive Split-Ordering
        2. 13.3.2. The BucketList Class
        3. 13.3.3. The LockFreeHashSet<T> Class
      4. 13.4. An Open-Addressed Hash Set
        1. 13.4.1. Cuckoo Hashing
        2. 13.4.2. Concurrent Cuckoo Hashing
        3. 13.4.3. Striped Concurrent Cuckoo Hashing
        4. 13.4.4. A Refinable Concurrent Cuckoo Hash Set
      5. 13.5. Chapter Notes
      6. 13.6. Exercises
    8. 14. Skiplists and Balanced Search
      1. 14.1. Introduction
      2. 14.2. Sequential Skiplists
      3. 14.3. A Lock-Based Concurrent Skiplist
        1. 14.3.1. A Bird’s-Eye View
        2. 14.3.2. The Algorithm
      4. 14.4. A Lock-Free Concurrent Skiplist
        1. 14.4.1. A Bird’s-Eye View
        2. 14.4.2. The Algorithm in Detail
      5. 14.5. Concurrent Skiplists
      6. 14.6. Chapter Notes
      7. 14.7. Exercises
    9. 15. Priority Queues
      1. 15.1. Introduction
        1. 15.1.1. Concurrent Priority Queues
      2. 15.2. An Array-Based Bounded Priority Queue
      3. 15.3. A Tree-Based Bounded Priority Queue
      4. 15.4. An Unbounded Heap-Based Priority Queue
        1. 15.4.1. A Sequential Heap
        2. 15.4.2. A Concurrent Heap
          1. Bird’s-Eye View
          2. In Detail
      5. 15.5. A Skiplist-Based Unbounded Priority Queue
      6. 15.6. Chapter Notes
      7. 15.7. Exercises
    10. 16. Futures, Scheduling, and Work Distribution
      1. 16.1. Introduction
      2. 16.2. Analyzing Parallelism
      3. 16.3. Realistic Multiprocessor Scheduling
      4. 16.4. Work Distribution
        1. 16.4.1. Work Stealing
        2. 16.4.2. Yielding and Multiprogramming
      5. 16.5. Work-Stealing Dequeues
        1. 16.5.1. A Bounded Work-Stealing Dequeue
        2. 16.5.2. An Unbounded Work-Stealing Dequeue
        3. 16.5.3. Work Balancing
      6. 16.6. Chapter Notes
      7. 16.7. Exercises
    11. 17. Barriers
      1. 17.1. Introduction
      2. 17.2. Barrier Implementations
      3. 17.3. Sense-Reversing Barrier
      4. 17.4. Combining Tree Barrier
      5. 17.5. Static Tree Barrier
      6. 17.6. Termination Detecting Barriers
      7. 17.7. Chapter Notes
      8. 17.8. Exercises
    12. 18. Transactional Memory
      1. 18.1. Introduction
        1. 18.1.1. What is Wrong with Locking?
        2. 18.1.2. What is Wrong with compareAndSet()?
        3. 18.1.3. What is Wrong with Compositionality?
        4. 18.1.4. What can We Do about It?
      2. 18.2. Transactions and Atomicity
      3. 18.3. Software Transactional Memory
        1. 18.3.1. Transactions and Transactional Threads
        2. 18.3.2. Zombies and Consistency
        3. 18.3.3. Atomic Objects
        4. 18.3.4. Dependent or Independent Progress?
        5. 18.3.5. Contention Managers
        6. 18.3.6. Implementing Atomic Objects
        7. 18.3.7. An Obstruction-Free Atomic Object
          1. Bird’s-Eye View
          2. Why It Works
          3. In Detail
        8. 18.3.8. A Lock-Based Atomic Object
          1. Bird’s-Eye View
          2. Why It Works
          3. In Detail
      4. 18.4. Hardware Transactional Memory
        1. 18.4.1. Cache Coherence
        2. 18.4.2. Transactional Cache Coherence
        3. 18.4.3. Enhancements
      5. 18.5. Chapter Notes
      6. 18.6. Exercises
  8. III. Appendix
    1. A. Software Basics
      1. A.1. Introduction
      2. A.2. Java
        1. A.2.1. Threads
        2. A.2.2. Monitors
        3. A.2.3. Yielding and Sleeping
        4. A.2.4. Thread-Local Objects
      3. A.3. C#
        1. A.3.1. Threads
        2. A.3.2. Monitors
        3. A.3.3. Thread-Local Objects
      4. A.4. Pthreads
        1. A.4.1. Thread-Local Storage
      5. A.5. Chapter Notes
    2. B. Hardware Basics
      1. B.1. Introduction (and a Puzzle)
      2. B.2. Processors and Threads
      3. B.3. Interconnect
      4. B.4. Memory
      5. B.5. Caches
        1. B.5.1. Coherence
        2. B.5.2. Spinning
      6. B.6. Cache-Conscious Programming, or the Puzzle Solved
      7. B.7. Multi-Core and Multi-Threaded Architectures
        1. B.7.1. Relaxed Memory Consistency
      8. B.8. Hardware Synchronization Instructions
      9. B.9. Chapter Notes
      10. B.10. Exercises
    3. Bibliography