Accelerators such as GPUs typically have a relatively small programmable cache that can be accessed far more quickly than the global accelerator memory used for an array or array_view. How much more quickly? On the order of a hundred times faster! Of course, there’s more to your algorithm than accessing memory, but taking advantage of that fast memory can significantly improve your application’s overall performance. It’s a bit com...


should use fast memory