Implementing the transform-reduction algorithm on the GPU

When implementing the actual transformation, we need to copy the data back and forth between the CPU and the GPU. Data structures that reside on the GPU are prefixed with gpu_, and data structures that reside on the CPU are prefixed with cpu_.

Note that Boost Compute provides a compute::plus<float> functor, the equivalent of std::plus, which we use when reducing the areas:

namespace bc = boost::compute;
auto circle_areas_gpu(bc::context& context, bc::command_queue& q) {
  // Create a bunch of random circles and copy to the GPU
  const auto n = 1024;
  auto cpu_circles = make_circles(n);
  auto gpu_circles = bc::vector<Circle>(n, context);
  bc::copy(cpu_circles.begin(), cpu_circles.end(), gpu_circles.begin(), ...
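
For comparison, the same transform-then-reduce pattern can be sketched on the CPU using only the standard library, with std::plus<float> playing the role that compute::plus<float> plays on the GPU. This is a minimal illustration, not the book's listing: the Circle layout (x, y, r) and the names circle_area and circle_areas_cpu are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the Circle type; the field names are assumed.
struct Circle {
  float x, y, r;
};

// The "transform" step: map one circle to its area.
inline float circle_area(const Circle& c) {
  assert(c.r >= 0.0f);  // a negative radius would be a caller error
  constexpr float pi = 3.14159265f;
  return pi * c.r * c.r;
}

// Transform every circle to its area, then reduce the areas to one sum.
inline float circle_areas_cpu(const std::vector<Circle>& cpu_circles) {
  // Step 1: transform into a separate buffer of areas.
  std::vector<float> cpu_areas(cpu_circles.size());
  std::transform(cpu_circles.begin(), cpu_circles.end(),
                 cpu_areas.begin(), circle_area);
  // Step 2: reduce the areas with std::plus<float>, starting from 0.
  return std::accumulate(cpu_areas.begin(), cpu_areas.end(), 0.0f,
                         std::plus<float>{});
}
```

Keeping the transform and the reduction as two separate steps over an intermediate buffer of areas makes the structure easy to compare with a GPU version, where the input must first be copied to the device and the result copied back.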
