Processing large NumPy arrays with memory mapping

Sometimes, we need to deal with NumPy arrays that are too big to fit in the system memory. A common solution is to use memory mapping and implement out-of-core computations. The array is stored in a file on the hard drive, and we create a memory-mapped object to this file that can be used as a regular NumPy array. Accessing a portion of the array results in the corresponding data being automatically fetched from the hard drive. Therefore, we only consume what we use.

How to do it...

  1. Let's create a memory-mapped array in write mode:
    >>> import numpy as np
    >>> nrows, ncols = 1000000, 100
    >>> f = np.memmap('memmapped.dat', dtype=np.float32,
                      mode='w+', shape=(nrows, ncols))
  2. Let's feed the array with random ...

Get IPython Interactive Computing and Visualization Cookbook - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.