Chapter 4

Optimizing for Reacting Navier-Stokes Equations

Antonio Valles (Intel, USA); Weiqun Zhang (Lawrence Berkeley National Laboratory, USA)

Abstract

The optimizations discussed in this chapter significantly improved concurrency on both Intel Xeon Phi coprocessors and Intel Xeon processors. On coprocessors, OpenMP scaling from one thread to 240 threads is now 100x, up from 38x in the first version. Similarly, scaling on processors improved from 10x to 16x. The chapter discusses source modifications that transform a fine-grain thread-parallel approach into a more coarse-grain one, memory allocation considerations on Intel Xeon Phi coprocessors, and source transformations that improve vectorization. In addition, this chapter briefly demonstrates how new features ...

From High Performance Parallelism Pearls, Volume One.