Kernel invocations with GPUArray
In the previous recipe, we saw how to invoke a kernel function using the class:
pycuda.compiler.SourceModule(kernel_source, nvcc="nvcc", options=None, other_options)
It creates a module from the CUDA source code in kernel_source by invoking the NVIDIA nvcc compiler with the given options.
However, PyCUDA also provides the class pycuda.gpuarray.GPUArray, which offers a high-level interface for performing calculations with CUDA:
class pycuda.gpuarray.GPUArray(shape, dtype, *, allocator=None, order="C")
This class works much like numpy.ndarray, except that it stores its data and performs its computations on the compute device. The shape and dtype arguments work exactly as in NumPy.
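As a minimal sketch of this interface, the following example copies a NumPy array to the device with pycuda.gpuarray.to_gpu, doubles it with ordinary arithmetic, and copies the result back with get(). The helper name double_array and the NumPy fallback for machines without PyCUDA or a CUDA device are assumptions made for illustration; they are not part of the recipe.

```python
import numpy as np


def double_array(a_host):
    """Return 2 * a_host, computed on the GPU when PyCUDA can be used."""
    try:
        # Importing pycuda.autoinit initializes CUDA and creates a context.
        import pycuda.autoinit  # noqa: F401
        import pycuda.gpuarray as gpuarray
    except Exception:
        # Assumption for this sketch: with no PyCUDA or no CUDA device,
        # fall back to the equivalent NumPy expression on the host.
        return 2 * a_host

    a_gpu = gpuarray.to_gpu(a_host)  # copy the host array to the device
    # shape and dtype carry over from the NumPy array
    assert a_gpu.shape == a_host.shape and a_gpu.dtype == a_host.dtype
    doubled_gpu = 2 * a_gpu          # elementwise arithmetic runs on the device
    return doubled_gpu.get()         # copy the result back as a numpy.ndarray


a = np.arange(16, dtype=np.float32).reshape(4, 4)
result = double_array(a)
print(result)
```

No kernel source is written here: the expression 2 * a_gpu is evaluated on the device by GPUArray itself, and only to_gpu and get() move data between host and device.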
All the arithmetic methods in ...