Scalable Computing and Communications: Theory and Practice by Lizhe Wang, Albert Y. Zomaya, Samee U. Khan

27

–––––––––––––––––––––––

Increasing Performance through Optimization on APU

Matthew Doerksen, Parimala Thulasiraman, and Ruppa Thulasiram

27.1   INTRODUCTION

As we move into the exascale era of computing, heterogeneous architectures have become an integral component of high-performance systems (HPSs) and of high-performance computing (HPC) in general. Over time, we have transitioned from homogeneous, central processing unit (CPU)-centric HPSs such as Jaguar [1] to heterogeneous HPSs such as Roadrunner [2], which uses a modified Cell processor, and the graphics processing unit (GPU)-based Tianhe-1A [3]. These HPSs have been vital for research applications but, until recently, have played no part in the consumer-level experience. However, new technologies such as AMD’s accelerated processing unit (APU) architecture, which fuses the CPU and the GPU onto a single chip, now put an affordable HPS at consumers’ disposal.

27.2   HETEROGENEOUS ARCHITECTURES

To begin, we provide a basic overview of the types of heterogeneous architectures currently available. A short list includes the Cell Broadband Engine (Cell BE) [4], GPUs from AMD [5] and NVIDIA, and AMD’s Fusion APU [6]. Each of these architectures has its own advantages and disadvantages that, in part, determine how well it performs for a particular situation or algorithm.

FIGURE 27.1.
