Applying Software-Managed Caching and CPU/GPU Task Scheduling for Accelerating Dynamic Workloads
Mark Silberstein, Assaf Schuster and John D. Owens
In this chapter we cover two difficult problems frequently encountered by GPU developers: optimizing memory access for kernels with complex input-dependent access patterns, and mapping computations to a GPU or a CPU in composite applications with multiple dependent kernels. Both pose a formidable challenge because they require dynamic adaptation and tuning of execution policies to achieve high performance across a wide range of inputs. Failing to meet these requirements leads to a substantial performance penalty.
We first describe our methodology for solving the memory optimization problem via software-managed ...