Time for action – sampling with numpy.random.choice()

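We will use the numpy.random.choice() function to perform bootstrapping, that is, to resample the data with replacement in order to estimate how much a statistic such as the mean varies.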

  1. Start the IPython or Python shell and import NumPy:
    $ ipython
    In [1]: import numpy as np
    
  2. Generate a data sample following the normal distribution:
    In [2]: N = 500
    
    In [3]: np.random.seed(52)
    
    In [4]: data = np.random.normal(size=N)
    
  3. Calculate the mean of the data:
    In [5]: data.mean()
    Out[5]: 0.07253250605445645
    

    Generate 100 bootstrap samples, each of size N and drawn with replacement from the original data, and calculate their means (using more samples generally gives a more stable estimate):

    In [6]: bootstrapped = np.random.choice(data, size=(N, 100))
    
    In [7]: means = bootstrapped.mean(axis=0)
    
    In [8]: means.shape
    Out[8]: (100,)
    
  4. Calculate the mean, variance, and standard deviation ...
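
The steps above can be gathered into a short standalone script. This is only a minimal sketch: the helper name bootstrap_means is hypothetical, and the summary statistics are computed on the bootstrap means, which is an assumption about what the truncated last step refers to. The script uses the same calls as the session (np.random.seed(), np.random.normal(), and np.random.choice()), and prints the textbook standard error of the mean for comparison.

```python
import numpy as np


def bootstrap_means(data, n_resamples):
    """Return the mean of each bootstrap resample of `data` (hypothetical helper)."""
    n = data.shape[0]
    # np.random.choice samples with replacement by default (replace=True).
    # Each of the n_resamples columns is one resample of size n.
    resamples = np.random.choice(data, size=(n, n_resamples))
    # Averaging over axis 0 collapses the rows, leaving one mean per column.
    return resamples.mean(axis=0)


N = 500
np.random.seed(52)
data = np.random.normal(size=N)

means = bootstrap_means(data, n_resamples=100)

# Assumption: the final step summarizes the bootstrap means.
print("sample mean:                ", data.mean())
print("mean of bootstrap means:    ", means.mean())
print("variance of bootstrap means:", means.var())
print("std of bootstrap means:     ", means.std())

# For comparison: the analytic standard error of the mean, s / sqrt(N).
print("analytic standard error:    ", data.std(ddof=1) / np.sqrt(N))
```

The standard deviation of the bootstrap means should land close to the analytic standard error, which is a quick sanity check that the resampling is laid out along the intended axis.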
