Multicore Software Development Techniques

Book Description

This book presents practical processes and techniques for multicore software development. It focuses on solving day-to-day problems, using practical tips and tricks and industry case studies to reinforce the key concepts.

Coverage includes:

  • The multicore landscape
  • Principles of parallel computing
  • Multicore SoC architectures
  • Multicore programming models
  • The multicore development process
  • Multicore programming with threads
  • Concurrency abstraction layers
  • Debugging multicore systems
  • Practical techniques for getting started in multicore development
  • Case studies in multicore systems development
  • Sample code to reinforce many of the concepts discussed


  • Presents the ‘nuts and bolts’ of programming a multicore system
  • Offers a short-format treatment of the practical processes and techniques used in multicore software development
  • Covers practical tips, tricks and industry case studies to enhance the learning process

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. Chapter 1. Principles of Parallel Computing
    1. Abstract
    2. 1.1 Concurrency versus Parallelism
    3. 1.2 Symmetric and Asymmetric Multiprocessing
    4. 1.3 Parallelism Saves Power
    5. 1.4 Key Challenges of Parallel Computing
  7. Chapter 2. Parallelism in All of Its Forms
    1. Abstract
    2. 2.1 Bit-Level Parallelism
    3. 2.2 Instruction-Level Parallelism (ILP)
    4. 2.3 Simultaneous Multithreading
    5. 2.4 Single Instruction, Multiple Data (SIMD)
    6. 2.5 Data Parallelism
    7. 2.6 Task Parallelism
    8. 2.7 Acceleration and Offload Engines
  8. Chapter 3. Multicore System Architectures
    1. Abstract
    2. 3.1 Shared Memory Multicore Systems
    3. 3.2 Cache Coherency
    4. 3.3 Shared Data Synchronization
    5. 3.4 Distributed Memory
    6. 3.5 Symmetric Multiprocessing
    7. 3.6 Asymmetric Multiprocessing
    8. 3.7 Hybrid Approaches
    9. 3.8 Speaking of Cores
    10. 3.9 Graphical Processing Units (GPU)
    11. 3.10 Putting It All Together
  9. Chapter 4. Multicore Software Architectures
    1. Abstract
    2. 4.1 Multicore Software Architectures
    3. 4.2 A Decision Tree Approach to Selecting a Multicore Architecture
  10. Chapter 5. Multicore Software Development Process
    1. Abstract
    2. 5.1 Multicore Programming Models
  11. Chapter 6. Putting it All Together, A Case Study of Multicore Development
    1. Abstract
    2. 6.1 Multiple-Single-Cores
    3. 6.2 Cooperating-Multiple-Cores
    4. 6.3 Getting Started
    5. 6.4 System Requirements
  12. Chapter 7. Multicore Virtualization
    1. Abstract
    2. 7.1 Hypervisor Classifications
    3. 7.2 Virtualization Use Cases for Multicore
    4. 7.3 Linux Hypervisors
    5. 7.4 Virtual Networking in Multicore
    6. 7.5 I/O Activity in a Virtualized Environment
    7. 7.6 Direct Device Assignment
  13. Chapter 8. Performance and Optimization of Multicore Systems
    1. Abstract
    2. 8.1 Select the Right “Core” for Your Multicore
    3. 8.2 Improve Serial Performance before Migrating to Multicore (Especially ILP)
    4. 8.3 Achieve Proper Load Balancing (SMP Linux) and Scheduling
    5. 8.4 Improve Data Locality
    6. 8.5 Reduce or Eliminate False Sharing
    7. 8.6 Use Affinity Scheduling When Necessary
    8. 8.7 Apply the Proper Lock Granularity and Frequency
    9. 8.8 Remove Sync Barriers Where Possible
    10. 8.9 Minimize Communication Latencies
    11. 8.10 Use Thread Pools
    12. 8.11 Manage Thread Count
    13. 8.12 Stay Out of the Kernel If at all Possible
    14. 8.13 Use Parallel Libraries (pthreads, OpenMP, etc.)
  14. Chapter 9. Sequential to Parallel Migration of Software Applications
    1. Abstract
    2. 9.1 Step 1: Understand Requirements
    3. 9.2 Step 2: Sequential Analysis
    4. 9.3 Step 3: Exploration
    5. 9.4 Step 4: Code Optimization and Tuning
    6. 9.5 Image Processing Example
    7. 9.6 Step 2: Sequential Analysis
    8. 9.7 Step 3: Exploration
    9. 9.8 Step 4: Optimization and Tuning
    10. 9.9 Data Parallel; First Attempt
    11. 9.10 Data Parallel—Second Try
    12. 9.11 Task Parallel—Third Try
    13. 9.12 Exploration Results
    14. 9.13 Tuning
    15. 9.14 Data Parallel—Third Try
    16. 9.15 Data Parallel—Third Results
    17. 9.16 Data Parallel—Fourth Try
    18. 9.17 Data Parallel—Work Queues
    19. 9.18 Going Too Far?
  15. Chapter 10. Concurrency Abstractions
    1. Abstract
    2. 10.1 Language Extensions Example—OpenMP
    3. 10.2 Framework Example—OpenCL
    4. 10.3 Libraries Example—Thread Building Libraries
    5. 10.4 Thread Safety
    6. 10.5 Message Passing Multicore Models—MPI and MCAPI
    7. 10.6 Language Support
    8. Additional Reading
  16. Appendix A. Source Code Examples
    1. Matrix Multiply – Naïve Version (Not Cache Friendly)
    2. Matrix Multiply—Cache Friendly Version
    3. Primes Code with Race Conditions
    4. Primes Code with Race Conditions FIXED
    5. Conway’s Game of Life Unoptimized
    6. Conway’s Game of Life Optimized
  17. Index