Chapter 9. The Correlation Coefficient

This chapter is concerned with measures of relatedness between two variables. A simple measure, the correlation coefficient, is commonly used to quantify the degree of association between two variables. Correlations are often used during an exploratory or observational stage of research to determine which variables have at least a statistical relationship with each other. In experimental designs, correlations are also used to determine the degree of association between independent and dependent (or response) variables. However, finding a correlation between two variables does not imply that a change in one variable causes a corresponding change in the other, which is why you still need experiments. Indeed, the practice of computing correlation coefficients en masse, often without any theoretical or model-based justification, has led to numerous errors of inference. In this chapter, you will learn about measures of association, such as Pearson’s correlation coefficient, the Spearman rank-order coefficient, the point-biserial correlation coefficient, and phi, and review examples of the appropriate use of each. The key message is that correlations are useful tools, but because many variables in nature are correlated, a correlation on its own is not always useful for inference.
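As a rough preview of how the measures named above relate to one another, the following minimal Python sketch computes Pearson’s coefficient from its definition and then reuses it for the other three. It relies on standard identities: Spearman’s coefficient is Pearson’s coefficient applied to ranks, the point-biserial coefficient is Pearson’s coefficient with one variable coded 0/1, and phi is Pearson’s coefficient with both variables coded 0/1. The function names (pearson_r, average_ranks, spearman_rho) and all data values are illustrative assumptions, not taken from the book.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariation of x and y scaled by their spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 25]   # monotone but not linear in hours
passed = [0, 0, 1, 1, 1]    # binary outcome coded 0/1
tutored = [0, 1, 0, 1, 1]   # second binary variable coded 0/1

print("Pearson r:      ", pearson_r(hours, score))
print("Spearman rho:   ", spearman_rho(hours, score))
print("Point-biserial: ", pearson_r(passed, score))
print("Phi:            ", pearson_r(passed, tutored))
```

In this made-up example the relationship between hours and score is perfectly monotone but curved, so the rank-based coefficient reaches 1 while Pearson’s coefficient stays below 1. None of these values says anything about whether one variable causes the other.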

Measuring Association

The world is awash with correlations, or statistical associations between two (or more) variables. Often, such relationships are useful to characterize ...
