Filling in the gaps with CUDA-C

We will now go through the very basics of how to write a full-on CUDA-C program. We'll start small and just translate the fixed version of the little matrix multiplication test program we just debugged in the last section to a pure CUDA-C program, which we will then compile from the command line with NVIDIA's nvcc compiler into a native Windows or Linux executable file (we will see how to use the Nsight IDE in the next section, so we will just be doing this with only a text editor and the command line for now). Again, the reader is encouraged to look at the code we are translating from Python as we go along, which is available as the matrix_ker.py file in the repository.

Now, let's open our favorite text editor ...

Get Hands-On GPU Programming with Python and CUDA now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.