Professional CUDA C Programming by Ty McKercher, Max Grossman, John Cheng

Chapter 9: Multi-GPU Programming

What's in this chapter?

  • Managing multiple GPUs
  • Executing kernels across multiple GPUs
  • Overlapping computation and communication between GPUs
  • Synchronizing across GPUs
  • Exchanging data using CUDA-aware MPI
  • Exchanging data using CUDA-aware MPI with GPUDirect RDMA
  • Scaling applications across a GPU-accelerated cluster
  • Understanding CPU and GPU affinity

So far, most of the examples in this book have used a single GPU. In this chapter, you will gain experience in multi-GPU programming: scaling your application across multiple GPUs within a compute node, or across multiple GPU-accelerated nodes. CUDA provides a number of features to facilitate multi-GPU programming, including multi-device management from one or more processes, direct access to other devices' memory using Unified Virtual Addressing (UVA) and GPUDirect, and computation-communication overlap across multiple devices using streams and asynchronous functions. In this chapter, you will learn the necessary skills to:

  • Manage and execute kernels on multiple GPUs.
  • Overlap computation and communication across multiple GPUs.
  • Synchronize execution across multiple GPUs using streams and events.
  • Scale CUDA-aware MPI applications across a GPU-accelerated cluster.
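
As a preview of the first two skills, the short sketch below dispatches independent work to every GPU in a node. It is an illustrative example rather than one of the chapter's listings: the scale kernel, the problem size N, and the CHECK error-checking macro are assumptions made here for brevity. The key pattern is that each call to cudaSetDevice switches the current device, so the allocations, stream creation, asynchronous copies, and kernel launches that follow are issued to that GPU; because nothing blocks between loop iterations, work on all devices proceeds concurrently.

    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    // Assumed error-checking macro, similar in spirit to the ones used later in this book.
    #define CHECK(call)                                                     \
    {                                                                       \
        const cudaError_t err = (call);                                     \
        if (err != cudaSuccess)                                             \
        {                                                                   \
            printf("Error: %s:%d, %s\n", __FILE__, __LINE__,                \
                   cudaGetErrorString(err));                                \
            exit(1);                                                        \
        }                                                                   \
    }

    // Hypothetical kernel: scales each element of a vector in place.
    __global__ void scale(float *data, float alpha, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= alpha;
    }

    int main(void)
    {
        int ngpus;
        CHECK(cudaGetDeviceCount(&ngpus));

        const int N = 1 << 20;                 // elements per device (assumed size)
        size_t nBytes = N * sizeof(float);

        float **d_data = (float **)malloc(ngpus * sizeof(float *));
        float **h_data = (float **)malloc(ngpus * sizeof(float *));
        cudaStream_t *streams = (cudaStream_t *)malloc(ngpus * sizeof(cudaStream_t));

        // Per-device setup: each cudaSetDevice call makes a different GPU current,
        // so the allocations and streams created below belong to that GPU.
        for (int i = 0; i < ngpus; i++)
        {
            CHECK(cudaSetDevice(i));
            CHECK(cudaMalloc((void **)&d_data[i], nBytes));
            CHECK(cudaMallocHost((void **)&h_data[i], nBytes)); // pinned memory enables async copies
            CHECK(cudaStreamCreate(&streams[i]));
            for (int j = 0; j < N; j++) h_data[i][j] = 1.0f;
        }

        // Issue asynchronous copies and kernel launches to every device without
        // waiting in between, so work on all GPUs overlaps.
        for (int i = 0; i < ngpus; i++)
        {
            CHECK(cudaSetDevice(i));
            CHECK(cudaMemcpyAsync(d_data[i], h_data[i], nBytes,
                                  cudaMemcpyHostToDevice, streams[i]));
            scale<<<(N + 255) / 256, 256, 0, streams[i]>>>(d_data[i], 2.0f, N);
            CHECK(cudaMemcpyAsync(h_data[i], d_data[i], nBytes,
                                  cudaMemcpyDeviceToHost, streams[i]));
        }

        // Synchronize each device's stream before using the results on the host.
        for (int i = 0; i < ngpus; i++)
        {
            CHECK(cudaSetDevice(i));
            CHECK(cudaStreamSynchronize(streams[i]));
            CHECK(cudaStreamDestroy(streams[i]));
            CHECK(cudaFreeHost(h_data[i]));
            CHECK(cudaFree(d_data[i]));
        }

        free(d_data);
        free(h_data);
        free(streams);
        return 0;
    }

Only after all work has been issued does the final loop synchronize each device's stream; deferring synchronization in this way is what allows the copies and kernels running on different GPUs to overlap, a pattern this chapter develops in detail.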
