Chapter 3

Introduction to Data Parallelism and CUDA C

Chapter Outline

3.1 Data Parallelism

3.2 CUDA Program Structure

3.3 A Vector Addition Kernel

3.4 Device Global Memory and Data Transfer

3.5 Kernel Functions and Threading

3.6 Summary

3.7 Exercises

References

Our main objective is to teach the key concepts involved in writing massively parallel programs in a heterogeneous computing system. This requires many code examples expressed in a reasonably simple language that supports massive parallelism and heterogeneous computing. We have chosen CUDA C for our code examples and exercises. CUDA C is an extension to the popular C programming language1 with new keywords and application programming interfaces for programmers to take advantage of heterogeneous ...

Get Programming Massively Parallel Processors, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.