Understanding statistical inference with Python
A computational approach to estimation and hypothesis testing
Do you know the difference between standard deviation and standard error? What about the importance of p-values or confidence intervals? Most people don’t really understand these concepts even after taking several statistics classes. The problem is that these courses focus on mathematical methods, burying the concepts under a mountain of details.
Join expert Allen Downey for a computational approach to statistical inference that uses random simulations instead of mathematical equations. Drawing on his book Think Stats, his courses at Olin College, and his blog, Probably Overthinking It, Allen walks you through using Python to implement simple statistical experiments and shares examples using real-world data to answer the three fundamental questions of statistical inference: how to use data to estimate the size of whatever effect you observe, how to quantify the precision of that estimate, and how to decide whether the apparent effect might be due to chance.
What you'll learn and how you can apply it
By the end of this live, online course, you’ll understand:
 The goals of statistical inference: estimating the size of an effect, quantifying the precision of the estimate, and testing hypotheses
 The limitations and hazards of statistical inference, including sampling bias, measurement error, and some causes of false-positive hypothesis tests
And you’ll be able to:
 Use computational tools to compute effect sizes, confidence intervals, standard errors, and p-values
 Choose appropriate statistics to measure effect size and test hypotheses
 Communicate statistical results to both technical and nontechnical audiences
This training course is for you because...
 You're a scientist designing experiments, interpreting data, and presenting results.
 You're an engineer developing statistical analysis pipelines that turn data into actionable information.
 You're a data scientist with only a vague memory of past statistics classes who needs to explain results clearly to collaborators and clients.
 You want to better understand statistical methods and implications.
Prerequisites

A working knowledge of Python and basic statistics concepts (mean, standard deviation, median, etc.)

All of the coding exercises in the course will be hosted on JupyterHub, and we'll send the URL out at the start of class. The environment is purely browser-based, so no installation is required.
Recommended preparation:
Losing your Loops: Fast Numerical Computing with NumPy (video)
"Classes and Methods" (chapter in Think Python)
About your instructor

Allen Downey is a professor of Computer Science at Olin College and the author of a series of free, open-source textbooks related to software and data science, including Think Python, Think Bayes, and Think Complexity, published by O’Reilly Media. His blog, Probably Overthinking It, features articles on Bayesian probability and statistics. He holds a Ph.D. in computer science from U.C. Berkeley, and M.S. and B.S. degrees from MIT. He lives near Boston, MA with his wife and two daughters.
Schedule
The timeframes are only estimates and may vary according to how the class is progressing.
Introduction: What is statistical inference? (10 minutes)
 Lecture: The p-value ban and why most published research findings are false; inference example: drug testing
 Q&A
Effect size (50 minutes)
 Lecture: Report effect size first (everything else is secondary)
 Hands-on exercises: Explore the difference in means, absolute and relative difference, and Cohen’s effect size; explore the difference in proportions, odds ratios, and log odds ratios
 Q&A
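The effect-size measures named in the exercises above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the course's own notebook code: the function names, the made-up data, and the choice of pooling variances weighted by sample size (one common convention for Cohen's effect size) are all assumptions.

```python
import numpy as np

def cohen_effect_size(group1, group2):
    """Difference in means divided by a pooled standard deviation.

    Assumption: variances are pooled weighted by sample size (ddof=0);
    other conventions divide by n - 1 or n1 + n2 - 2 instead.
    """
    group1 = np.asarray(group1, dtype=float)
    group2 = np.asarray(group2, dtype=float)
    n1, n2 = len(group1), len(group2)
    diff = group1.mean() - group2.mean()
    pooled_var = (n1 * group1.var() + n2 * group2.var()) / (n1 + n2)
    return diff / np.sqrt(pooled_var)

def odds_ratio(p1, p2):
    """Ratio of the odds p / (1 - p) for two proportions."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Made-up data for illustration only.
rng = np.random.default_rng(17)
treated = rng.normal(loc=52, scale=10, size=200)
control = rng.normal(loc=50, scale=10, size=200)

abs_diff = treated.mean() - control.mean()    # absolute difference in means
rel_diff = 100 * abs_diff / control.mean()    # relative difference, in percent
d = cohen_effect_size(treated, control)       # difference in pooled-SD units

# Hypothetical proportions of "success" in two groups.
ratio = odds_ratio(0.75, 0.5)
log_ratio = np.log(ratio)

print(f"absolute difference: {abs_diff:.2f}")
print(f"relative difference: {rel_diff:.1f}%")
print(f"Cohen's effect size: {d:.2f}")
print(f"odds ratio: {ratio:.2f}, log odds ratio: {log_ratio:.2f}")
```

Reporting the difference in units of standard deviation makes effects comparable across measurements with different scales, which is one reason to compute it alongside the raw difference.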
Break (10 minutes)
Quantifying precision (60 minutes)
 Lecture: Sampling bias, measurement error, and random error; sampling statistics and sampling distributions; the difference between standard deviation and standard error; quantifying precision; the limitations of confidence intervals
 Hands-on exercises: Generate sampling distributions by simulation; estimate sampling distributions by resampling; use the Resampler framework to compute the sampling distribution for Cohen’s effect size
 Q&A
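As a rough illustration of the resampling idea in this section, the sketch below estimates the sampling distribution of a mean by drawing from one sample with replacement. It uses plain NumPy and is not the Resampler framework mentioned above; the data, the function name resample_means, and the choice of a 90% interval are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(18)

# Hypothetical sample of 100 measurements (made-up data).
sample = rng.normal(loc=170, scale=8, size=100)

def resample_means(data, iters=1000, rng=rng):
    """Draw from the data with replacement and record the mean each time."""
    n = len(data)
    return np.array([
        rng.choice(data, size=n, replace=True).mean()
        for _ in range(iters)
    ])

sampling_dist = resample_means(sample)

# Standard deviation describes the spread of the individual measurements.
std_dev = sample.std()

# Standard error describes the spread of the estimate (here, the mean)
# across the resampled experiments.
std_err = sampling_dist.std()

# A 90% confidence interval from the 5th and 95th percentiles.
ci_low, ci_high = np.percentile(sampling_dist, [5, 95])

print(f"standard deviation: {std_dev:.2f}")
print(f"standard error: {std_err:.2f}")
print(f"90% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The standard error comes out much smaller than the standard deviation because it measures how much the estimate varies from one resampled experiment to the next, not how much individual values vary. It accounts only for random error, not for sampling bias or measurement error.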
Break (10 minutes)
Hypothesis testing (50 minutes)
 Lecture: The logic of the null hypothesis significance test (NHST); interpreting p-values; the limitations of hypothesis testing
 Hands-on exercises: Test difference in means by permutation; use the HypothesisTest framework
 Q&A
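Testing a difference in means by permutation, as in the exercise above, might look roughly like the following. This is a minimal NumPy sketch rather than the HypothesisTest framework used in class; the function name, the two-sided test statistic, and the example data are assumptions.

```python
import numpy as np

def permutation_test(group1, group2, iters=1000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so
    the pooled data is shuffled and split again to see how often chance
    alone produces a difference at least as large as the observed one.
    """
    rng = np.random.default_rng(seed)
    group1 = np.asarray(group1, dtype=float)
    group2 = np.asarray(group2, dtype=float)
    observed = abs(group1.mean() - group2.mean())
    pooled = np.concatenate([group1, group2])
    n1 = len(group1)

    count = 0
    for _ in range(iters):
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:n1].mean() - shuffled[n1:].mean())
        if diff >= observed:
            count += 1
    return count / iters

# Made-up data for illustration only.
rng = np.random.default_rng(19)
treated = rng.normal(loc=52, scale=10, size=200)
control = rng.normal(loc=50, scale=10, size=200)

p_value = permutation_test(treated, control)
print(f"estimated p-value: {p_value:.3f}")
```

With 1,000 iterations the p-value is only resolved to about 0.001, so a result of 0 is better read as "less than 0.001" than as exactly zero.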
Other resources, wrap-up, and Q&A (20 minutes)