Chapter 9

OpenCL Case Study: Histogram

Introduction

This chapter discusses specific optimizations for a memory-bound kernel. The kernel we choose for this chapter is an image histogram operation. The source data for this operation is an 8-bit-per-pixel image, and the goal is to count its pixel values into 256 32-bit histogram bins.

The principle of the histogram algorithm is to perform the following operation over each element of the image:

 for (many input values) {
   histogram[value]++;
 }
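As a minimal sketch of the loop above, the serial version might look like the following in C. The names `histogram_serial`, `NUM_BINS`, `image`, and `num_pixels` are hypothetical and chosen only for illustration; the sketch assumes the image is stored as a flat array of 8-bit values and is not the optimized kernel this chapter develops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_BINS 256  /* assumed name: one bin per possible 8-bit pixel value */

/* Hypothetical helper: counts how often each 8-bit value occurs in `image`.
   `histogram` must point to NUM_BINS 32-bit counters; it is cleared first. */
static void histogram_serial(const uint8_t *image, size_t num_pixels,
                             uint32_t histogram[NUM_BINS])
{
    assert(image != NULL || num_pixels == 0);
    assert(histogram != NULL);

    /* Start every bin at zero. */
    memset(histogram, 0, NUM_BINS * sizeof histogram[0]);

    /* One scattered read-modify-write per input value. */
    for (size_t i = 0; i < num_pixels; ++i) {
        histogram[image[i]]++;
    }
}
```

Calling `histogram_serial` on a buffer of pixels fills the caller's 256-entry array, which occupies only 1 KB (256 bins of 4 bytes each). The bin that each iteration touches depends entirely on the input data, so the access pattern cannot be predicted from the loop index.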

This algorithm performs many scattered read-modify-write accesses into a small histogram data structure. On a CPU, these accesses will be served by the cache, with a high rate of reuse of elements. On a GPU, these accesses will be resolved in global memory, which will ...
