  • Shaojun Fu thinks this is interesting:

Accelerators such as a typical GPU have a relatively small programmable cache that can be accessed far more quickly than the global accelerator memory used for an array or array_view. How much more quickly? On the order of a hundred times faster! Of course, there’s more to your algorithm than accessing memory, but taking advantage of that fast memory can produce a significant gain that can have a big impact on your application’s overall performance. It’s a bit com...
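As a rough illustration of why the fast memory pays off, here is a minimal CPU-only sketch in standard C++ that mimics the staging pattern the passage describes. It is not C++ AMP code and is not taken from the book. The function name `rank_within_tile`, the constant `kTileSize`, and the ranking computation itself are hypothetical, chosen only because every "thread" has to read every element of its tile. In C++ AMP the local buffer would be declared `tile_static`, and the threads of a tile would synchronize on the tile barrier before reading it.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical tile width. In C++ AMP the tile size is likewise fixed at
// compile time.
constexpr std::size_t kTileSize = 4;

// CPU-only emulation of the tiling pattern (an illustrative sketch, not
// accelerator code). For each element, counts how many elements in the same
// tile are strictly smaller. Assumes data.size() is a multiple of kTileSize.
std::vector<int> rank_within_tile(const std::vector<int>& data) {
    std::vector<int> ranks(data.size(), 0);
    const std::size_t tile_count = data.size() / kTileSize;

    for (std::size_t tile = 0; tile < tile_count; ++tile) {
        const std::size_t base = tile * kTileSize;

        // Stand-in for a tile_static array: small, fast, shared by one tile.
        std::array<int, kTileSize> local{};

        // Step 1: each "thread" copies exactly one element out of the slow
        // global memory into the fast local buffer.
        for (std::size_t t = 0; t < kTileSize; ++t) {
            local[t] = data[base + t];
        }

        // On an accelerator, a tile barrier would go here so that no thread
        // reads the buffer before every thread has finished writing it.

        // Step 2: each "thread" reads the whole tile, but only from the fast
        // copy. Global memory is read kTileSize times per tile in total,
        // instead of kTileSize * kTileSize times.
        for (std::size_t t = 0; t < kTileSize; ++t) {
            int smaller = 0;
            for (std::size_t other = 0; other < kTileSize; ++other) {
                if (local[other] < local[t]) {
                    ++smaller;
                }
            }
            ranks[base + t] = smaller;
        }
    }
    return ranks;
}
```

Calling `rank_within_tile({3, 1, 2, 0, 5, 5, 4, 9})` treats the input as two tiles of four elements. The gain on real hardware comes from step 2: the more often each staged element is reused, the more slow global reads the fast memory replaces.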


From: C++ AMP: Accelerated Massive Parallelism with Microsoft® Visual C++®


should use fast memory