Case study – Histogram calculation

In the Histogram calculation section of Chapter 3, Buffers and Image Objects, we discussed a naive implementation of histogram computation for an image. We read an input image file and passed the pixel buffer to the OpenCL device to compute the histogram of the image. As you may have observed, that implementation is not well optimized, because it relies on sequential reads. In this section we will try to optimize it by using the atomic_inc OpenCL C built-in and by making coalesced reads and writes to global and local memory. Take a look at the following kernel:

#define BIN_SIZE 256
#define ELEMENTS_TO_PROCESS 256
__kernel void histogram_kernel(__global const uint* data, __global ...
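The kernel listing is truncated here, so the following is not the book's kernel. It is a minimal plain C sketch that emulates, serially on the CPU, the general strategy the text describes: each work-group accumulates into its own local histogram while neighbouring work-items read neighbouring addresses, and the local result is then merged into the global histogram. The function name histogram_two_level, the group_size parameter, and the use of one byte per value are assumptions made for illustration. Because the emulation is serial, the increments that would be atomic_inc calls in OpenCL C are ordinary increments here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BIN_SIZE 256            /* number of histogram bins, as in the listing */
#define ELEMENTS_TO_PROCESS 256 /* values handled by each emulated work-item */

/* Hypothetical helper: serial emulation of a two-level histogram.
 * data        - input values, one byte each (assumption for this sketch)
 * n           - number of input values
 * group_size  - emulated work-group size (assumption, must be > 0)
 * global_hist - output array of BIN_SIZE counters */
void histogram_two_level(const unsigned char *data, size_t n,
                         size_t group_size, unsigned int *global_hist)
{
    assert(group_size > 0);
    memset(global_hist, 0, BIN_SIZE * sizeof(unsigned int));

    /* Each emulated work-group covers this many consecutive input values. */
    const size_t per_group = group_size * ELEMENTS_TO_PROCESS;

    for (size_t base = 0; base < n; base += per_group) {
        /* Stands in for the __local histogram of one work-group. */
        unsigned int local_hist[BIN_SIZE] = {0};

        /* In step i, work-item lid reads element base + i * group_size + lid,
         * so adjacent work-items touch adjacent addresses (the coalesced
         * pattern). On a device each increment would be an atomic_inc. */
        for (size_t i = 0; i < ELEMENTS_TO_PROCESS; ++i) {
            for (size_t lid = 0; lid < group_size; ++lid) {
                size_t idx = base + i * group_size + lid;
                if (idx < n) {
                    local_hist[data[idx]]++;
                }
            }
        }

        /* Merge the group's local histogram into the global one. On a device
         * this step would also need atomic updates on global memory. */
        for (size_t bin = 0; bin < BIN_SIZE; ++bin) {
            global_hist[bin] += local_hist[bin];
        }
    }
}
```

Calling histogram_two_level(pixels, count, 64, hist) with a caller-provided array of BIN_SIZE counters fills hist with the per-value counts. The point of the two-level layout is that most contended increments land in the small per-group histogram, and global memory is touched only once per bin per group.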
