WHAT'S IN THIS CHAPTER?
Understanding SIMD and vectorization
Understanding extended instruction sets
Working with Intel Math Kernel Library
Working with multicore-ready, highly optimized software functions
Mixing task-based programming with external optimized libraries
Generating pseudo-random numbers in parallel
Working with the thread-local storage classes
Using Intel Integrated Performance Primitives
In the previous 10 chapters, you learned to create and coordinate code that runs many tasks in parallel to improve performance. To improve throughput even further, you can take advantage of other parallelism-related capabilities of modern hardware. This chapter covers additional performance libraries and includes examples of their integration with .NET Framework 4 and the new task-based programming model. It also provides examples that use the new thread-local storage classes and the lazy-initialization capabilities these classes provide.
The "Parallel Programming and Multicore Programming" section of Chapter 1, "Task-Based Programming," introduced the different kinds of parallel architectures. That section also explained that most modern microprocessors can execute Single Instruction, Multiple Data (SIMD) instructions. Because the execution units for SIMD instructions usually belong to a physical core, it is possible ...