CUDA Application Design and Development by Rob Farber

Chapter 6. Efficiently Using GPU Memory
The importance of efficiently using GPU memory cannot be overstated. With roughly a three-orders-of-magnitude difference in speed between the fastest on-chip register memory and mapped data that must traverse the PCIe bus, literate CUDA developers must understand the most efficient ways to use memory. Latency hiding through ILP or TLP is essential to application performance. Prefetching can keep more memory transactions in flight to move data to fast memory and speed up even memory-bandwidth-limited reduction operations. Irregular data structures are a challenge with current GPU technology, but some techniques can preserve performance even with random memory accesses. However, finding more and better ways to ...
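
As an illustration of the prefetching idea mentioned above, the following minimal sketch (not taken from the book; the kernel name sumReduce, the launch configuration, and the test data are illustrative assumptions) shows a grid-stride sum reduction in which each thread loads the next element into a register while accumulating the current one, so an extra memory transaction stays in flight per thread before the block finishes with a standard shared-memory tree reduction.

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void sumReduce(const float *in, float *out, int n)
{
    extern __shared__ float sdata[];
    int tid    = threadIdx.x;
    int stride = blockDim.x * gridDim.x;
    int i      = blockIdx.x * blockDim.x + tid;

    float sum = 0.0f;
    if (i < n) {
        float cur = in[i];                    // first load
        for (int next = i + stride; next < n; next += stride) {
            float pre = in[next];             // prefetch the next element
            sum += cur;                       // accumulate the current one
            cur = pre;
        }
        sum += cur;                           // fold in the last prefetched value
    }
    sdata[tid] = sum;
    __syncthreads();

    // Standard shared-memory tree reduction within the block
    // (assumes blockDim.x is a power of two).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) atomicAdd(out, sdata[0]);
}

int main()
{
    const int n       = 1 << 20;
    const int threads = 256;
    const int blocks  = 128;

    std::vector<float> h_in(n, 1.0f);         // expected sum: n
    float *d_in, *d_out, h_out = 0.0f;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_out, &h_out, sizeof(float), cudaMemcpyHostToDevice);

    sumReduce<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_out, n);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);

    printf("sum = %f (expected %d)\n", h_out, n);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}

The register prefetch gives each thread two independent operations per iteration (a load and an add), which is a simple form of the ILP-based latency hiding discussed in this chapter; whether it helps in practice depends on occupancy and on how memory-bandwidth-bound the kernel already is.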