High-Performance Computing and Concurrency

Video Description

It's déjà vu all over again. In the old days (35-plus years ago), developers who wanted to write half-decent programs had to know their hardware. Those days are back: clock frequencies have peaked, and hardware can no longer be abstracted away behind high-level languages. Designed for developers with high performance requirements (games, financial analysis, scientific computation, etc.), this course teaches you what really happens when programs execute and the subtle details that make a program run slow or fast.

With a focus on concurrency, specifically local concurrency (multi-threading), the course is all about writing efficient programs that make the best use of the computing resources available to you. While the sample code is written in C++, the course is not C++-specific: if you can read C++ but don't use it in your work, you will still learn from this course.

  • Learn how programs execute in hardware and the subtle details that affect program speed
  • Practice writing efficient programs that get the most out of today’s CPUs, caches, and memory
  • Discover how single and multi-core CPUs interact with memory and how to avoid memory slowness
  • Explore memory models, concurrent data structures, lock-free concurrency, and lock-based concurrency
  • Acquire the tools needed to measure the performance of programs and their components
Fedor G. Pikus is a chief engineering scientist in the Design-to-Silicon division of Mentor Graphics and a former senior software engineer at Google. Fedor builds the design automation tools used by the people who build the chips in your computers, cars, and more. He has over 25 patents and over 90 papers and conference presentations on physics, EDA, software design, and the C++ language. He holds a Ph.D. in Applied Physics from Peter the Great St. Petersburg Polytechnic University.

Table of Contents

  1. Introduction
    1. Introduction And Course Overview 00:12:13
    2. About The Author 00:04:17
    3. How To Access Your Working Files 00:01:15
  2. Memory Architecture And Performance Impact
    1. Overview 00:12:32
    2. Overview (Continued) 00:11:36
    3. Access Patterns And Impact On Algorithms And Data Structure Design 00:14:00
    4. Many Threads (Multi-Core Access) 00:12:45
  3. Measuring Time In Programs
    1. Real Time And CPU Time 00:13:34
    2. TSC Timers 00:09:25
    3. Profiling Tools 00:09:08
  4. Threads
    1. Overview 00:07:37
    2. Threads In C++ 00:14:39
    3. Avoiding Data Races And Its Cost 00:11:54
  5. How Threads Interact With Memory
    1. Concurrency And Memory 00:14:17
    2. Data Sharing 00:06:51
    3. False Data Sharing 00:15:41
  6. Synchronization Of Memory Accesses
    1. Locks (Mutexes) Part - 1 00:12:04
    2. Locks (Mutexes) Part - 2 00:12:03
    3. Locks (Spinlocks) 00:17:12
    4. Lock-Free Synchronization And Other Options - Part 1 00:11:15
    5. Lock-Free Synchronization And Other Options - Part 2 00:10:27
  7. Memory Models
    1. Memory Model 00:10:01
    2. C++ Memory Model 00:06:47
    3. Memory Order 00:07:36
    4. Memory Order Guarantees In C++ 00:06:30
  8. Memory Barriers
    1. Need For Memory Barriers 00:18:06
    2. Memory Barriers 00:10:07
    3. Synchronization, Revisited - Part 1 00:07:42
    4. Synchronization, Revisited - Part 2 00:19:19
  9. Lock-Based And Lock-Free Programming
    1. Efficient Concurrency; Types Of Concurrent Programs 00:12:01
    2. Problems With Locks Part - 1 00:10:13
    3. Problems With Locks Part - 2 00:06:54
    4. Thread-Safe Data Structures Part - 1 00:14:56
    5. Thread-Safe Data Structures Part - 2 00:06:55
    6. Introduction To Lock-Free Programming 00:17:49
  10. Lock-Free Data Structures
    1. Shared Pointer Part - 1 00:10:49
    2. Shared Pointer Part - 2 00:09:02
    3. Shared Pointer Part - 3 00:10:34
    4. Shared Pointer Part - 4 00:10:26
    5. Shared Pointer Part - 5 00:09:51
    6. Shared Pointer Part - 6 00:15:42
    7. Node-Based Containers Part - 1 (List) 00:10:40
    8. Node-Based Containers Part - 2 (List) 00:08:27
    9. Node-Based Containers Part - 3 (List) 00:11:23
    10. Node-Based Containers Part - 4 (List) 00:11:16
    11. Node-Based Containers Part - 5 (List) 00:09:44
    12. Node-Based Containers Part - 6 (List) 00:06:33
    13. Node-Based Containers Part - 7 (List) 00:07:56
    14. Sequential Containers Part - 1 (Queue) 00:10:09
    15. Sequential Containers Part - 2 (Queue) 00:10:09
    16. Sequential Containers Part - 3 (Queue) 00:13:20
    17. Sequential Containers Part - 4 (Queue) 00:09:37
    18. Sequential Containers Part - 5 (Queue) 00:09:55
    19. Sequential Containers Part - 6 (Queue) 00:13:36
  11. Performance In Real Life
    1. Practical Performance 00:16:03
    2. Factors Affecting Performance Part - 1 00:09:13
    3. Factors Affecting Performance Part - 2 00:12:30
  12. Concurrent Data Structures In Depth
    1. Concurrency, Performance, And Order Guarantees Part - 1 00:08:19
    2. Concurrency, Performance, And Order Guarantees Part - 2 00:09:15
    3. Toward More General Data Structures 00:09:54
  13. Conclusion
    1. Conclusions And Where To Go From Here 00:14:14