Chapter 12. Reduction

Reduction is a class of parallel algorithms that pass over O(N) input data and generate an O(1) result computed with a binary associative operator ⊕. Examples of such operations include minimum, maximum, sum, sum of squares, AND, OR, and the dot product of two vectors. Reduction is also an important primitive used as a subroutine in other operations, such as Scan (covered in the next chapter).
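To make the role of associativity concrete, the following is a minimal host-side C++ sketch, not GPU code and not taken from the book. The names `reduce_serial` and `reduce_tree` are hypothetical, and the caller is assumed to supply an identity value for the operator. Because the operator is associative, the pairwise (tree-ordered) version gives the same result as the left-to-right loop, and the combinations within each pass are independent of one another, which is what a parallel implementation can exploit.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Serial reduction: evaluates ((x0 op x1) op x2) op ... from left to right.
// `identity` is returned for empty input (assumption: the caller supplies it).
template <typename T, typename Op>
T reduce_serial(const std::vector<T>& in, T identity, Op op) {
    T acc = identity;
    for (const T& x : in) {
        acc = op(acc, x);
    }
    return acc;
}

// Pairwise (tree) reduction over a private copy of the input.
// Each pass combines elements `stride` apart; there are ceil(log2(N)) passes.
// The iterations of the inner loop touch disjoint elements, so they could
// run in parallel. Operand order is preserved, so only associativity
// (not commutativity) of `op` is required.
template <typename T, typename Op>
T reduce_tree(std::vector<T> data, T identity, Op op) {
    if (data.empty()) {
        return identity;
    }
    for (std::size_t stride = 1; stride < data.size(); stride *= 2) {
        for (std::size_t i = 0; i + stride < data.size(); i += 2 * stride) {
            data[i] = op(data[i], data[i + stride]);
        }
    }
    return data[0];
}
```

For example, `reduce_tree(v, 0, std::plus<int>())` computes a sum, and passing a lambda that returns `std::min(a, b)` together with a sufficiently large identity computes a minimum. With floating-point addition, which is only approximately associative, the two orderings can differ in the last bits.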

Unless the operator ⊕ is extremely expensive to evaluate, reduction tends to be bandwidth-bound. Our treatment of reduction begins with ...
