Chapter 36. Stream Reduction Operations for GPGPU Applications

Daniel Horn, Stanford University

Many GPGPU-based applications rely on the fragment processor, which operates across a large set of output memory locations. For each location, it consumes a fixed number of input elements and runs a small program on those elements to produce a single output element at that location. Because the fragment program must write its result to a predetermined memory location, it cannot vary the amount of data it outputs according to the input data it processes. (See Chapter 30 in this book, “The GeForce 6 Series GPU Architecture,” for more details on the capabilities and limits of the fragment processor.)
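The constraint described above can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions, not the chapter's implementation: `fragment_pass`, `keep_if_positive`, the `fan_in` parameter, and the `NULL` sentinel value are hypothetical names introduced here for illustration. The point it shows is that each output location receives exactly one value, so a "filter" can only mark unwanted elements rather than shrink the output.

```python
# Hypothetical sentinel marking "no data here"; a real GPU program would
# have to pick some value representable in the output texture format.
NULL = 0.0


def fragment_pass(kernel, inputs, fan_in):
    """Emulate one pass of a fragment-style program on the CPU.

    Each output location i reads a fixed window of `fan_in` inputs and
    writes exactly one result to slot i. The kernel cannot choose where
    to write, and it cannot decline to write.
    """
    n_out = len(inputs) // fan_in
    output = [None] * n_out  # output size is fixed before any kernel runs
    for i in range(n_out):
        window = inputs[i * fan_in:(i + 1) * fan_in]
        output[i] = kernel(window)
    return output


def keep_if_positive(window):
    """Attempted filter: keep positive values, 'drop' the rest.

    The value cannot really be dropped, so the sentinel is written instead.
    """
    x = window[0]
    return x if x > 0 else NULL


# A fixed fan-in of 2: every output element sums two inputs.
pair_sums = fragment_pass(sum, [1, 2, 3, 4, 5, 6], fan_in=2)

# A filter with fan-in 1: the output is exactly as long as the input.
data = [3.0, -1.0, 4.0, -1.0, 5.0]
filtered = fragment_pass(keep_if_positive, data, fan_in=1)

print(pair_sums)
print(filtered, len(filtered) == len(data))
```

In this emulation the filtered result still occupies five slots, two of which hold the sentinel. Packing the surviving elements into a shorter, contiguous array requires a separate step.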

Many algorithms are difficult to implement under ...

From GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation.
