Book description

Think Stats: Probability and Statistics for Programmers is a textbook for a new kind of introductory prob-stat class. It emphasizes the use of statistics to explore large datasets. It takes a computational approach: students write programs in Python as a way of developing and testing their understanding.

Table of Contents

  1. Think Stats
  2. Preface
    1. Why I Wrote This Book
    2. How I Wrote This Book
    3. Contributor List
    4. Conventions Used in This Book
    5. Using Code Examples
    6. Safari® Books Online
    7. How to Contact Us
  3. 1. Statistical Thinking for Programmers
    1. Do First Babies Arrive Late?
    2. A Statistical Approach
    3. The National Survey of Family Growth
    4. Tables and Records
    5. Significance
    6. Glossary
  4. 2. Descriptive Statistics
    1. Means and Averages
    2. Variance
    3. Distributions
    4. Representing Histograms
    5. Plotting Histograms
    6. Representing PMFs
    7. Plotting PMFs
    8. Outliers
    9. Other Visualizations
    10. Relative Risk
    11. Conditional Probability
    12. Reporting Results
    13. Glossary
  5. 3. Cumulative Distribution Functions
    1. The Class Size Paradox
    2. The Limits of PMFs
    3. Percentiles
    4. Cumulative Distribution Functions
    5. Representing CDFs
    6. Back to the Survey Data
    7. Conditional Distributions
    8. Random Numbers
    9. Summary Statistics Revisited
    10. Glossary
  6. 4. Continuous Distributions
    1. The Exponential Distribution
    2. The Pareto Distribution
    3. The Normal Distribution
    4. Normal Probability Plot
    5. The Lognormal Distribution
    6. Why Model?
    7. Generating Random Numbers
    8. Glossary
  7. 5. Probability
    1. Rules of Probability
    2. Monty Hall
    3. Poincaré
    4. Another Rule of Probability
    5. Binomial Distribution
    6. Streaks and Hot Spots
    7. Bayes’s Theorem
    8. Glossary
  8. 6. Operations on Distributions
    1. Skewness
    2. Random Variables
    3. PDFs
    4. Convolution
    5. Why Normal?
    6. Central Limit Theorem
    7. The Distribution Framework
    8. Glossary
  9. 7. Hypothesis Testing
    1. Testing a Difference in Means
    2. Choosing a Threshold
    3. Defining the Effect
    4. Interpreting the Result
    5. Cross-Validation
    6. Reporting Bayesian Probabilities
    7. Chi-Square Test
    8. Efficient Resampling
    9. Power
    10. Glossary
  10. 8. Estimation
    1. The Estimation Game
    2. Guess the Variance
    3. Understanding Errors
    4. Exponential Distributions
    5. Confidence Intervals
    6. Bayesian Estimation
    7. Implementing Bayesian Estimation
    8. Censored Data
    9. The Locomotive Problem
    10. Glossary
  11. 9. Correlation
    1. Standard Scores
    2. Covariance
    3. Correlation
    4. Making Scatterplots in Pyplot
    5. Spearman’s Rank Correlation
    6. Least Squares Fit
    7. Goodness of Fit
    8. Correlation and Causation
    9. Glossary
  12. Index
  13. About the Author
  14. Colophon
  15. Copyright