Best practices for statistics

Statistics are an integral part of any predictive modelling assignment because they help us gauge how well a model performs. Each predictive model generates a set of statistics that suggest how good the model is and how it can be fine-tuned to perform better. The following is a summary of the most widely reported statistics and their desired values for the predictive models described in this book:

Algorithms | Statistics/Parameters | Desired values of the statistics
Linear regression | R², p-values, F-statistic, and Adj. R² | High Adj. R², high F-statistic, and low p-values
Logistic regression | Sensitivity, specificity, Area Under the Curve (AUC), and KS statistic | High AUC (proximity ...
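
To make the table more concrete, the following is a minimal sketch in plain Python of how several of these statistics can be computed by hand. The function names (`linear_fit_stats`, `classifier_stats`), the 0.5 threshold, and the sample data are hypothetical and chosen only for illustration; they are not taken from the book. p-values are left out because they require a probability distribution function, which would normally come from a statistics library.

```python
def linear_fit_stats(x, y):
    """Fit y = a + b*x by least squares; return R2, adjusted R2, and F-statistic."""
    n, k = len(x), 1  # k = number of predictors (one in this sketch)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R2 penalises R2 for the number of predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    # F-statistic: explained variance per predictor over unexplained variance
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    return {"r2": r2, "adj_r2": adj_r2, "f_stat": f_stat}


def classifier_stats(labels, scores, threshold=0.5):
    """Return sensitivity, specificity, AUC, and KS for binary labels (0/1)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Sensitivity: share of actual positives predicted positive
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    # Specificity: share of actual negatives predicted negative
    specificity = sum(s < threshold for s in neg) / len(neg)
    # AUC: chance that a random positive outscores a random negative
    # (ties count as half a win)
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))
    # KS: largest gap between the cumulative score distributions of the classes
    ks = max(
        abs(sum(s <= t for s in pos) / len(pos)
            - sum(s <= t for s in neg) / len(neg))
        for t in set(scores)
    )
    return {"sensitivity": sensitivity, "specificity": specificity,
            "auc": auc, "ks": ks}


# Made-up sample data, for illustration only
lin = linear_fit_stats([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
clf = classifier_stats([0, 0, 1, 1, 0, 1], [0.1, 0.4, 0.35, 0.8, 0.2, 0.9])
print(lin)
print(clf)
```

In practice these values are usually read from a library's model summary rather than computed by hand; the sketch is only meant to show what each statistic measures.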
