GLOSSARY

2 × 2 contingency table
a contingency table having one variable with two possible values and one outcome with two possible values.
2 × 5 contingency table
a contingency table having one variable with two possible values and one outcome with five possible values.
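As a rough illustration of the two contingency table entries above, the sketch below counts paired observations into a table with one row per variable value and one column per outcome value. The data and the `contingency_table` helper are hypothetical, made up for this example.

```python
from collections import Counter

# Hypothetical paired observations: (variable value, outcome value).
observations = [
    ("treatment", "improved"),
    ("treatment", "improved"),
    ("treatment", "not improved"),
    ("control", "improved"),
    ("control", "not improved"),
    ("control", "not improved"),
]


def contingency_table(pairs):
    """Count how often each (variable value, outcome value) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({variable for variable, _ in pairs})
    cols = sorted({outcome for _, outcome in pairs})
    # One row per variable value, one column per outcome value.
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


rows, cols, table = contingency_table(observations)
for label, row in zip(rows, table):
    print(label, row)
```

With two variable values and two outcome values this yields a 2 × 2 table; an outcome with five possible values would give a 2 × 5 table from the same function.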
Alternative hypothesis
in a hypothesis test, the claim you accept as true only if you have enough evidence in the data to reject the null hypothesis.
Anderson–Darling test
a hypothesis test to determine whether a sample conforms to a specific probability distribution, most often the normal distribution.
Anomaly detection
the process of identifying outliers or other unusual observations.
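One simple way to flag outliers, shown here only as a sketch, is the 1.5 × IQR rule: anything more than 1.5 interquartile ranges outside the first or third quartile is flagged. The choice of this rule, the sample data, and the `find_outliers` name are assumptions for illustration; the glossary entry does not prescribe a method.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical measurements with one unusually large observation.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(find_outliers(data))
```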
Average
a measure of the central value in a set of observations. Also called a sample mean. Calculated by adding all the values and dividing by the sample size.
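The calculation described above can be sketched in a few lines. The `average` function name and the sample values are hypothetical.

```python
def average(values):
    """Add all the values and divide by the sample size."""
    if not values:
        raise ValueError("average() needs at least one observation")
    return sum(values) / len(values)


print(average([2, 4, 6, 8]))
```

The standard library's `statistics.mean` performs the same calculation.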
Bimodal distribution
a split frequency distribution, where observations are clustered around each of two central values.
Binomial distribution
a probability distribution describing the number of successes in a fixed number of independent trials, each with two possible outcomes and the same probability of success.
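For a concrete sense of the binomial distribution, the sketch below computes the probability of exactly k successes in n independent trials, each succeeding with probability p. The `binomial_pmf` name and the coin-flip numbers are chosen for illustration only.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    # Number of ways to choose which k trials succeed, times the
    # probability of any one such arrangement.
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))
```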
Blocking
in a study, the process of collecting data in groups, where each group is as homogeneous as possible; a sampling technique to minimize the impact of potentially confounding factors.
Bootstrapping
a data-based method for assessing the accuracy of an estimate, such as its variance, bias, or confidence interval. Based on resampling the observed data with replacement.
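A minimal bootstrapping sketch, assuming the estimate of interest is the sample mean and using a simple percentile interval: the sample is resampled with replacement many times, and the spread of the resampled means indicates how accurate the original estimate is. The data, the seed, the number of resamples, and the `bootstrap_mean` name are all hypothetical choices.

```python
import random
import statistics


def bootstrap_mean(sample, n_resamples=2000, seed=0):
    """Return (standard error, 95% percentile interval) for the sample mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = rng.choices(sample, k=len(sample))
        estimates.append(statistics.mean(resample))
    estimates.sort()
    standard_error = statistics.stdev(estimates)
    # Percentile interval: cut off 2.5% of the estimates in each tail.
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return standard_error, (lower, upper)


# Hypothetical sample of measurements.
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 7.2, 5.1, 4.9, 6.3, 5.7]
standard_error, (lower, upper) = bootstrap_mean(sample)
print(f"SE of the mean: {standard_error:.3f}")
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```

Other statistics, such as the median, can be assessed the same way by swapping out the call to `statistics.mean`.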
Boxplot
a graphical representation of a five-number summary, illustrating the minimum, ...
