Using printf from within CUDA kernels

It may come as a surprise, but we can print text to standard output directly from within a CUDA kernel; not only that, each individual thread can print its own output. This comes in particularly handy when we are debugging our kernels, since we often need to monitor the values of particular variables or computations at particular points in our code. It also frees us from having to step through the kernel line by line in a debugger. Printing output from a CUDA kernel is done with one of the most familiar functions in C/C++ programming, the one that most people learn when they write their first Hello world program in C: printf. Of course, printf is the standard ...
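To make the idea concrete, here is a minimal sketch in plain CUDA C++ rather than code taken from the book. The names `hello_kernel` and `run_hello` are hypothetical, and the launch configuration of 2 blocks of 4 threads is an arbitrary choice for illustration. Each thread prints its own thread and block index. Device-side printf output is buffered and only flushed at certain points such as a synchronization call, so the host waits with `cudaDeviceSynchronize()` before exiting.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread prints one line identifying itself.
__global__ void hello_kernel()
{
    printf("Hello from thread %u in block %u\n", threadIdx.x, blockIdx.x);
}

// Hypothetical helper: launch the kernel, then wait for it to finish so
// that the device-side printf buffer is flushed to standard output.
cudaError_t run_hello(int blocks, int threads_per_block)
{
    hello_kernel<<<blocks, threads_per_block>>>();

    // Catch launch errors (for example, an invalid configuration).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        return err;
    }

    // Block until the kernel completes; this also flushes printf output.
    return cudaDeviceSynchronize();
}

int main()
{
    // Assumed configuration for illustration: 2 blocks of 4 threads each.
    cudaError_t err = run_hello(2, 4);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compiled with `nvcc` and run on a machine with a CUDA-capable GPU, this should print eight lines, one per thread. The order in which lines from different threads appear is not guaranteed, so output used for debugging should always include the thread and block indices, as it does here.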
