Unit 47: Doing Stats the Python Way

Python support for random numbers and statistics is scattered across several modules: statistics, numpy.random, pandas, and scipy.stats.

Generating Random Numbers

The numpy.random module has random number generators for all major probability distributions.
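As a minimal sketch of what this looks like in practice, the following draws small samples from three common distributions with functions in numpy.random. The seed value, the sample sizes, and the distribution parameters are arbitrary choices made for illustration; they do not come from the text.

```python
import numpy.random as rnd

rnd.seed(42)  # arbitrary fixed seed, so repeated runs print the same numbers

# Five draws from the uniform distribution on [0, 1)
uniform_sample = rnd.uniform(0.0, 1.0, size=5)

# Five draws from the normal distribution with mean 0 and standard deviation 1
normal_sample = rnd.normal(0.0, 1.0, size=5)

# Five draws from the binomial distribution: successes in 10 trials with p = 0.5
binomial_sample = rnd.binomial(10, 0.5, size=5)

print(uniform_sample)
print(normal_sample)
print(binomial_sample)
```

Each function takes the distribution's parameters first and an optional size argument that sets how many values to draw.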

Early in this book, you learned that data analysis code should be reproducible: anyone should be able to run the same program with the same input data and get the same results. Always initialize the pseudorandom number generator with the seed function. Otherwise, the generators produce a different pseudorandom sequence on every program run, which may make the results hard or impossible to reproduce.

 import numpy.random as rnd
 rnd.seed(z)  # z: any fixed integer, used as the seed
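The effect of seeding can be checked directly. The sketch below assumes a hypothetical seed value (SEED = 12345, chosen only for illustration) and shows that reseeding with the same value restarts the same pseudorandom sequence, while continuing without reseeding yields different numbers.

```python
import numpy.random as rnd

SEED = 12345  # hypothetical seed value; any fixed integer works

rnd.seed(SEED)
first_run = rnd.normal(size=3)

rnd.seed(SEED)  # reseeding restarts the same pseudorandom sequence
second_run = rnd.normal(size=3)

# No reseeding here: the generator continues where it left off
continued = rnd.normal(size=3)

print((first_run == second_run).all())  # True
print(first_run)
print(continued)
```

Because the seed is fixed in the program, anyone who runs it gets the same three values in first_run and second_run.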

