
Another metric for evaluating a linear regression model is the R2 (R-squared) metric, also known as the coefficient of determination. Rather than measuring error directly, R2 represents the proportion of the variance in the dependent variable that is explained by the independent variable(s). The equation for calculating R2 is as follows:

$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$

In this equation, SSE is the sum of squared errors of the model's predictions, and SST is the Total Sum of Squares, that is, the SSE of a baseline model that always predicts the overall mean of the dependent variable (illustrated in Figure 4.1 by the red horizontal line). An R2 value of 0 implies a linear regression model that provides no improvement over the baseline model (in other words, SSE = SST). An R2 value of 1 implies ...
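To make the relationship between SSE, SST, and R2 concrete, here is a minimal sketch in plain Python. It assumes the standard definition R2 = 1 - SSE / SST. The function name `r_squared` and the sample values are hypothetical and chosen only for illustration; they do not come from the book's dataset.

```python
def r_squared(actual, predicted):
    """Return R2 = 1 - SSE / SST for two equal-length lists of numbers."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")

    mean_actual = sum(actual) / len(actual)

    # SSE: squared errors of the model's predictions
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))

    # SST: squared errors of the baseline that always predicts the mean
    sst = sum((y - mean_actual) ** 2 for y in actual)

    if sst == 0:
        raise ValueError("R2 is undefined when the actual values are all identical")

    return 1 - sse / sst


# Hypothetical observations, for illustration only
actual = [1.0, 2.0, 3.0, 4.0]

# Baseline model: always predict the mean (2.5), so SSE equals SST
baseline_predictions = [2.5, 2.5, 2.5, 2.5]
print(r_squared(actual, baseline_predictions))  # 0.0

# A hypothetical fitted model whose predictions are closer to the observations
model_predictions = [1.5, 2.0, 3.0, 3.5]
print(r_squared(actual, model_predictions))
```

Predicting the mean for every observation makes SSE equal to SST, so the function returns 0, which matches the no-improvement case described above. The second set of predictions has a smaller SSE than the baseline, so its R2 is greater than 0.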
