Kernel invocations with GPUArray

In the previous recipe, we saw how to invoke a kernel function using the class:

pycuda.compiler.SourceModule(kernel_source, nvcc="nvcc", options=None, other_options)

It creates a module from the CUDA source code passed as kernel_source, invoking the NVIDIA nvcc compiler with the given options to compile that code.

PyCUDA also provides the class pycuda.gpuarray.GPUArray, which offers a higher-level interface for performing calculations with CUDA:

class pycuda.gpuarray.GPUArray(shape, dtype, *, allocator=None, order="C")

This class works in a similar way to numpy.ndarray, except that it stores its data and performs its computations on the compute device. The shape and dtype arguments work exactly as in NumPy.
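As a rough illustration of this NumPy-like behavior, the following minimal sketch copies an array to the device, doubles it there, and copies the result back. The helper name double_array is hypothetical and not part of PyCUDA; the sketch also assumes that falling back to plain NumPy is acceptable when PyCUDA or a CUDA device is not available, which is only there to keep the example runnable everywhere.

```python
import numpy as np


def double_array(host_array):
    """Double every element, on the GPU if PyCUDA is usable, else on the CPU.

    Hypothetical helper written for illustration; not part of PyCUDA.
    """
    try:
        import pycuda.autoinit  # noqa: F401  (importing it creates a CUDA context)
        import pycuda.gpuarray as gpuarray
    except Exception:
        # Assumption: PyCUDA is not installed or no CUDA device is present,
        # so compute the same result with NumPy instead.
        return 2 * host_array

    device_array = gpuarray.to_gpu(host_array)  # copy host -> device
    doubled = 2 * device_array                  # elementwise multiply on the device
    return doubled.get()                        # copy device -> host (numpy.ndarray)


a = np.arange(8, dtype=np.float32).reshape(2, 4)
result = double_array(a)
print(result)
```

Note that no kernel source is written by hand here: the multiplication is expressed as ordinary Python arithmetic on the GPUArray object, and get() returns the result as a regular numpy.ndarray.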

All the arithmetic methods in ...
