Chapter 1, Why GPU Programming?

  1. The first two for loops iterate over every pixel, and each pixel's output is independent of the others, so we can parallelize over these two loops. The third for loop computes the final value of a single pixel; each of its iterations depends on the result of the previous one, so it is inherently sequential and cannot be parallelized.
  2. Amdahl's Law doesn't account for the time it takes to transfer data between the host and the GPU.
  3. A 512 x 512 image has 262,144 pixels. The first GPU can compute the outputs of only half of the pixels at once, while the second GPU can compute all of them at once, so the second GPU will be about twice as fast as the first here. The third GPU has more than enough cores to compute all pixels at once, but as we saw in problem 1, ...
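
The loop structure described in answer 1 can be sketched in plain Python. This is only an illustration: the excerpt does not show the code under discussion, so the sketch assumes a Mandelbrot-style per-pixel iteration, and the names mandelbrot_pixel and mandelbrot_grid, the bounds, and the default parameters are all hypothetical.

```python
def mandelbrot_pixel(c, max_iters=20, upper_bound=2.0):
    """Return 1.0 if c stays bounded for max_iters steps, else 0.0."""
    z = 0j
    # Third loop: each z depends on the previous z, so these
    # iterations must run one after another.
    for _ in range(max_iters):
        z = z * z + c
        if abs(z) > upper_bound:
            return 0.0
    return 1.0


def mandelbrot_grid(width, height, real_low=-2.0, real_high=2.0,
                    imag_low=-2.0, imag_high=2.0):
    """Evaluate every pixel of a width x height grid (both must be >= 2)."""
    grid = [[0.0] * width for _ in range(height)]
    # First two loops: no pixel reads another pixel's result, so every
    # (row, col) pair could in principle be handled by its own thread.
    for row in range(height):
        for col in range(width):
            real = real_low + (real_high - real_low) * col / (width - 1)
            imag = imag_low + (imag_high - imag_low) * row / (height - 1)
            grid[row][col] = mandelbrot_pixel(complex(real, imag))
    return grid


if __name__ == "__main__":
    grid = mandelbrot_grid(8, 8)
    print(sum(map(sum, grid)), "of", 8 * 8, "sample points stayed bounded")
```

In this sketch the outer two loops are the natural unit of parallel work (one pixel per thread), while the body of mandelbrot_pixel remains serial however many cores are available.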
