High Performance Parallelism Pearls Volume Two by James Reinders, Jim Jeffers

Chapter 19

OpenCL

There and Back Again

Matthias Noack*; Florian Wende*; Klaus-Dieter Oertel†
* Zuse Institute Berlin, Germany
† Intel Corporation, Germany

Abstract

The chapter presents a case study on optimizing the Hexciton kernel of the GPU-HEOM code for parallelism. The HEOM method bridges biology and quantum physics to simulate molecular light-harvesting complexes. The Hexciton kernel computes a commutator term for a large set of small, complex matrices, which is relevant in other domains too. Starting with a naive reference implementation, the chapter develops a fully optimized OpenCL kernel by analyzing different techniques. The chapter compares automatic and manual vectorization techniques to optimize the memory layout for contiguous ...