O'Reilly logo

CUDA Application Design and Development by Rob Farber

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 10. CUDA in a Cloud and Cluster Environments
Distributed GPUs are powering the fastest supercomputers as CUDA scales to thousands of nodes using MPI, the de facto standard for distributed computing. NVIDIA's GPUDirect has accelerated the key operations in MPI, send (MPI_Send) and receive (MPI_Recv), for communication over the network instead of the PCIe bus. As the name implies, GPUDirect moves data between GPUs without involving the processor on any of the host systems. This chapter focuses on the use of MPI and GPUDirect so that CUDA programmers can incorporate these APIs to create applications for cloud computing and computational clusters. The performance benefits of distributed GPU computing are very real but dependent on the bandwidth ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required