Chapter 7

Data Management

When programming with OpenCL for a discrete GPU, the overhead of transferring data between the host and the device needs to be considered carefully, since this communication can dominate overall program execution time. Data transfer time can negate the performance benefits afforded by data-parallel execution on GPUs, and it is not uncommon for transfer time to be of the same order as kernel execution time. As we move to shared-memory CPU–GPU systems (APUs), the performance issues involved with proper data management and communication are equally critical. This chapter introduces many of the key concepts and presents the details required to understand data transfers and data accesses within discrete and shared-memory heterogeneous systems.
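To see how large this overhead can be in practice, transfer time can be measured directly with OpenCL profiling events. The following is a minimal sketch (not a listing from this book): it assumes a single GPU device, allocates a hypothetical 64 MB host array, copies it to a device buffer with clEnqueueWriteBuffer, and reads the transfer duration from the event's profiling timestamps. Error checking is omitted for brevity.

#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    /* Enable profiling so the transfer time can be read from the event. */
    cl_command_queue queue = clCreateCommandQueue(ctx, device,
                                                  CL_QUEUE_PROFILING_ENABLE, NULL);

    const size_t nbytes = 64 * 1024 * 1024;      /* 64 MB of host data (illustrative size) */
    float *host_data = (float *)malloc(nbytes);

    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY, nbytes, NULL, NULL);

    /* Blocking write: copy host_data across the PCIe bus to the device buffer. */
    cl_event xfer;
    clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, nbytes, host_data,
                         0, NULL, &xfer);

    /* Query the event for start/end timestamps (reported in nanoseconds). */
    cl_ulong start, end;
    clGetEventProfilingInfo(xfer, CL_PROFILING_COMMAND_START,
                            sizeof(start), &start, NULL);
    clGetEventProfilingInfo(xfer, CL_PROFILING_COMMAND_END,
                            sizeof(end), &end, NULL);
    printf("Host-to-device transfer: %.3f ms\n", (end - start) * 1e-6);

    clReleaseEvent(xfer);
    clReleaseMemObject(buf);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    free(host_data);
    return 0;
}

Timing the kernel's own event in the same way makes it easy to compare the two directly and decide whether a transfer is worth overlapping with computation or avoiding altogether on a shared-memory device.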
