CUDA for Engineers: An Introduction to High-Performance Parallel Computing

Book Description

CUDA for Engineers gives you direct, hands-on engagement with personal, high-performance parallel computing, enabling you to do computations on a gaming-level PC that would have required a supercomputer just a few years ago.

The authors introduce the essentials of CUDA C programming clearly and concisely, quickly guiding you from running sample programs to building your own code. Throughout, you’ll learn from complete examples you can build, run, and modify, complemented by additional projects that deepen your understanding. All projects are fully developed, with detailed building instructions for all major platforms.

Ideal for any scientist, engineer, or student with at least introductory programming experience, this guide assumes no specialized background in GPU-based or parallel computing. In an appendix, the authors also present a refresher on C programming for those who need it.

Coverage includes

  • Preparing your computer to run CUDA programs

  • Understanding CUDA’s parallelism model and C extensions

  • Transferring data between CPU and GPU

  • Managing timing, profiling, error handling, and debugging

  • Creating 2D grids

  • Interoperating with OpenGL to provide real-time user interactivity

  • Performing basic simulations with differential equations

  • Using stencils to manage related computations across threads

  • Exploiting CUDA’s shared memory capability to enhance performance

  • Interacting with 3D data: slicing, volume rendering, and ray casting

  • Using CUDA libraries

  • Finding more CUDA resources and code

Realistic example applications include

  • Visualizing functions in 2D and 3D

  • Solving differential equations while changing initial or boundary conditions

  • Viewing/processing images or image stacks

  • Computing inner products and centroids

  • Solving systems of linear algebraic equations

  • Performing Monte Carlo computations

Table of Contents

    1. About This E-Book
    2. Title Page
    3. Copyright Page
    4. Praise for CUDA for Engineers
    5. Dedication Page
    6. Contents
    7. Acknowledgments
    8. About the Authors
    9. Introduction
      1. What Is CUDA?
      2. What Does “Need-to-Know” Mean for Learning CUDA?
      3. What Is Meant by “for Engineers”?
      4. What Do You Need to Get Started with CUDA?
      5. How Is This Book Structured?
      6. Conventions Used in This Book
      7. Code Used in This Book
      8. User’s Guide
      9. Historical Context
      10. References
    10. Chapter 1. First Steps
      1. Running CUDA Samples
        1. CUDA Samples Under Windows
        2. CUDA Samples Under Linux
        3. Estimating “Acceleration”
      2. Running Our Own Serial Apps
        1. dist_v1
        2. dist_v2
      3. Summary
      4. Suggested Projects
    11. Chapter 2. CUDA Essentials
      1. CUDA’s Model for Parallelism
      2. Need-to-Know CUDA API and C Language Extensions
      3. Summary
      4. Suggested Projects
      5. References
    12. Chapter 3. From Loops to Grids
      1. Parallelizing dist_v1
        1. Executing dist_v1_cuda
      2. Parallelizing dist_v2
      3. Standard Workflow
      4. Simplified Workflow
        1. Unified Memory and Managed Arrays
        2. Distance App with cudaMallocManaged()
      5. Summary
      6. Suggested Projects
      7. References
    13. Chapter 4. 2D Grids and Interactive Graphics
      1. Launching 2D Computational Grids
        1. Syntax for 2D Kernel Launch
        2. Defining 2D Kernels
        3. dist_2d
      2. Live Display via Graphics Interop
      3. Application: Stability
        1. Running the Stability Visualizer
      4. Summary
      5. Suggested Projects
      6. References
    14. Chapter 5. Stencils and Shared Memory
      1. Thread Interdependence
      2. Computing Derivatives on a 1D Grid
        1. Implementing dd_1d_global
        2. Implementing dd_1d_shared
        3. Solving Laplace’s Equation in 2D: heat_2d
        4. Sharpening Edges in an Image: sharpen
      3. Summary
      4. Suggested Projects
      5. References
    15. Chapter 6. Reduction and Atomic Functions
      1. Threads Interacting Globally
      2. Implementing parallel_dot
      3. Computing Integral Properties: centroid_2d
      4. Summary
      5. Suggested Projects
      6. References
    16. Chapter 7. Interacting with 3D Data
      1. Launching 3D Computational Grids: dist_3d
      2. Viewing and Interacting with 3D Data: vis_3d
        1. Slicing
        2. Volume Rendering
        3. Raycasting
        4. Creating the vis_3d App
      3. Summary
      4. Suggested Projects
      5. References
    17. Chapter 8. Using CUDA Libraries
      1. Custom versus Off-the-Shelf
      2. Thrust
        1. Computing Norms with inner_product()
        2. Computing Distances with transform()
        3. Estimating Pi with generate(), transform(), and reduce()
      3. cuRAND
      4. NPP
        1. sharpen_npp
        2. More Image Processing with NPP
      5. Linear Algebra Using cuSOLVER and cuBLAS
      6. cuDNN
      7. ArrayFire
      8. Summary
      9. Suggested Projects
      10. References
    18. Chapter 9. Exploring the CUDA Ecosystem
      1. The Go-To List of Primary Sources
        1. CUDA Zone
        2. Other Primary Web Sources
        3. Online Courses
        4. CUDA Books
      2. Further Sources
        1. CUDA Samples
        2. CUDA Languages and Libraries
        3. More CUDA Books
      3. Summary
      4. Suggested Projects
    19. Appendix A. Hardware Setup
      1. Checking for an NVIDIA GPU: Windows
      2. Checking for an NVIDIA GPU: OS X
      3. Checking for an NVIDIA GPU: Linux
      4. Determining Compute Capability
      5. Upgrading Compute Capability
        1. Mac or Notebook Computer with a CUDA-Enabled GPU
        2. Desktop Computer
    20. Appendix B. Software Setup
      1. Windows Setup
        1. Creating a Restore Point
        2. Installing the IDE
        3. Installing the CUDA Toolkit
        4. Initial Test Run
      2. OS X Setup
        1. Downloading and Installing the CUDA Toolkit
      3. Linux Setup
        1. Preparing the System Software for CUDA Installation
        2. Downloading and Installing the CUDA Toolkit
        3. Installing Samples to the User Directory
        4. Initial Test Run
    21. Appendix C. Need-to-Know C Programming
      1. Characterization of C
      2. C Language Basics
      3. Data Types, Declarations, and Assignments
      4. Defining Functions
      5. Building Apps: Create, Compile, Run, Debug
        1. Building Apps in Windows
        2. Building Apps in Linux
      6. Arrays, Memory Allocation, and Pointers
      7. Control Statements: for, if
        1. The for Loop
        2. The if Statement
        3. Other Control Statements
      8. Sample C Programs
        1. dist_v1
        2. dist_v2
        3. dist_v2 with Dynamic Memory
      9. References
    22. Appendix D. CUDA Practicalities: Timing, Profiling, Error Handling, and Debugging
      1. Execution Timing and Profiling
        1. Standard C Timing Methods
        2. CUDA Events
        3. Profiling with NVIDIA Visual Profiler
        4. Profiling in Nsight Visual Studio
      2. Error Handling
      3. Debugging in Windows
      4. Debugging in Linux
      6. Using Visual Studio Property Pages
      7. References
    23. Index
    24. Code Snippets