OpenCL Programming by Example by Koushik Bhattacharyya, Ravishekhar Banger

Case study – Histogram calculation

In the section Histogram calculation in Chapter 3, Buffers and Image Objects, we discussed a naive implementation of histogram computation for an image. We read an input image file and passed the pixel buffer to the OpenCL device, which computed the histogram of the image. By now you will have noticed that this implementation is not well optimized, because it reads the input sequentially. In this section we will optimize it by using the atomic_inc OpenCL C built-in and by making reads and writes to global and local memory coalesced. Take a look at the following kernel:

#define BIN_SIZE 256
#define ELEMENTS_TO_PROCESS 256
__kernel void histogram_kernel(__global const uint* data, __global ...
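Because the kernel listing is cut off in this excerpt, the general idea can be illustrated with a minimal sketch in plain C that emulates the scheme on the host. It is not the book's kernel. The names histogram_group and histogram_emulated, the GROUP_SIZE value, and the choice of one 8-bit intensity per input element are all assumptions made for illustration. Each emulated work-group fills a private 256-bin histogram (the role local memory plays on the device) and then merges it into the global histogram. Within a group, work-item lid visits elements lid, lid + GROUP_SIZE, and so on, which is the strided pattern that lets neighbouring work-items read neighbouring addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIN_SIZE 256    /* one bin per 8-bit intensity value */
#define GROUP_SIZE 64   /* hypothetical work-group size, chosen for illustration */

/* Emulates one work-group processing its chunk of the input.
 * Assumption: each input element is a single 8-bit intensity. */
static void histogram_group(const uint8_t *chunk, size_t chunk_len,
                            uint32_t *global_hist)
{
    /* Stands in for a __local histogram shared by the work-group. */
    uint32_t local_hist[BIN_SIZE];
    memset(local_hist, 0, sizeof local_hist);

    /* Work-item lid reads elements lid, lid + GROUP_SIZE, ...
     * so that, at each step, adjacent work-items touch adjacent addresses. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
        for (size_t i = lid; i < chunk_len; i += GROUP_SIZE) {
            /* On a device this increment would be atomic_inc on local memory,
             * because several work-items may hit the same bin at once. */
            local_hist[chunk[i]]++;
        }
    }

    /* Merge step: on a device this would be an atomic add into global
     * memory, performed after a work-group barrier. */
    for (size_t bin = 0; bin < BIN_SIZE; ++bin) {
        global_hist[bin] += local_hist[bin];
    }
}

/* Splits the input into chunks of chunk_len elements, one per emulated
 * work-group, and accumulates the result into global_hist (BIN_SIZE bins). */
void histogram_emulated(const uint8_t *pixels, size_t n, size_t chunk_len,
                        uint32_t *global_hist)
{
    assert(chunk_len > 0);
    memset(global_hist, 0, BIN_SIZE * sizeof *global_hist);

    for (size_t start = 0; start < n; start += chunk_len) {
        size_t remaining = n - start;
        size_t len = remaining < chunk_len ? remaining : chunk_len;
        histogram_group(pixels + start, len, global_hist);
    }
}
```

The sketch runs sequentially, so it shows only the access pattern and the two-level accumulation. On a real device the increments run concurrently, which is why the atomic built-ins are needed there.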
