Correlating a binary and a continuous variable with the point biserial correlation

The point-biserial correlation measures the association between a binary variable Y and a continuous variable X. The coefficient is calculated as follows:

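$$ r_{pb} = \frac{M_1 - M_2}{s_n} \sqrt{\frac{n_1 n_2}{n^2}} \qquad (3.21) $$

The subscripts in (3.21) correspond to the two groups of the binary variable. M1 is the mean of X over the observations where Y is 1, and M2 is the mean of X over the observations where Y is 0. Likewise, n1 and n2 are the numbers of observations in the two groups, n = n1 + n2 is the total sample size, and sn is the standard deviation of X over all n observations (dividing by n rather than n - 1).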
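To make the formula concrete, here is a minimal sketch in plain Python. It assumes the standard form of the coefficient, (M1 - M2) / sn * sqrt(n1 * n2 / n^2), where sn is the standard deviation of X over all n observations. The function names and the small rain and temperature lists are hypothetical, made up for illustration only. The point-biserial coefficient is mathematically the same as the Pearson correlation between the 0/1 variable and the continuous variable, so the sketch computes both as a sanity check.

```python
import math


def point_biserial(binary, values):
    """Point-biserial correlation, assuming the standard formula.

    binary: sequence of 0/1 group labels (Y)
    values: sequence of continuous measurements (X)
    """
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group2 = [v for b, v in zip(binary, values) if b == 0]
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    m1 = sum(group1) / n1  # mean of X where Y is 1
    m2 = sum(group2) / n2  # mean of X where Y is 0
    mean = sum(values) / n
    # Standard deviation over all observations, dividing by n
    s_n = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return (m1 - m2) / s_n * math.sqrt(n1 * n2 / n ** 2)


def pearson(xs, ys):
    """Plain Pearson correlation, used here only as a cross-check."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: 1 = rain, 0 = no rain; temperatures in degrees Celsius
rain = [1, 0, 0, 1, 1, 0, 0, 1]
temp = [12.1, 18.4, 20.0, 11.5, 14.2, 19.3, 17.8, 13.0]

r_pb = point_biserial(rain, temp)
r_pearson = pearson(rain, temp)
print(r_pb, r_pearson)
```

In this made-up sample the rainy days are cooler, so the coefficient comes out negative; swapping which group is labeled 1 flips the sign but not the magnitude.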

In this recipe, the binary variable we will use is rain or no rain. We will correlate this variable with temperature.

How to do it...

We will calculate the correlation with the scipy.stats.pointbiserialr() function. We will also compute the ...
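As a rough illustration of the call itself, the sketch below applies scipy.stats.pointbiserialr() to synthetic data. This is not the recipe's own code or dataset: the random rain indicator, the temperature model, and the seed are all assumptions invented for the example. The comparison against scipy.stats.pearsonr() is an added cross-check, included because the two coefficients should agree for a 0/1 variable.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (an assumption, not the recipe's weather dataset)
rng = np.random.default_rng(42)
rain = rng.integers(0, 2, size=200)  # 1 = rain, 0 = no rain
# Hypothetical model: rainy days are about 3 degrees cooler on average
temp = 15.0 - 3.0 * rain + rng.normal(0.0, 2.0, size=200)

# pointbiserialr takes the binary variable first, then the continuous one,
# and returns the coefficient together with a two-sided p-value
r_pb, p_value = stats.pointbiserialr(rain, temp)

# Cross-check: Pearson correlation on the same pair of variables
r_pearson, _ = stats.pearsonr(rain, temp)

print("point-biserial r:", r_pb)
print("p-value:", p_value)
print("Pearson r:", r_pearson)
```

The result is unpacked as a tuple so the sketch does not depend on the attribute names of the result object, which have changed between SciPy versions.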
