Summary

In this chapter, we focused on two important elements of a predictive analytics project: the data and the evaluation of the model's predictive power. We first listed the most common problems encountered with raw data, their impact on the linear regression model, and ways to solve them. The reader should now be able to identify and handle missing values, outliers, and imbalanced datasets, and to normalize features where needed.
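As a quick recap of those data preparation steps, here is a minimal sketch using only the Python standard library. The specific choices (median imputation, clipping at 1.5 times the interquartile range, z-score normalization) and the function names are illustrative assumptions, not necessarily the exact techniques or transformations applied by Amazon ML.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical raw column with one missing value and one extreme value.
raw = [1, 2, 3, None, 5, 6, 7, 8, 9, 10, 1000]
cleaned = z_score(clip_outliers(impute_median(raw)))
```

The order matters in this sketch: imputing first gives the outlier rule a complete column to work with, and clipping before normalization keeps a single extreme value from dominating the mean and standard deviation.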

We also introduced the two most frequent problems in predictive analytics: underfitting and overfitting. L1 and L2 regularization, an important element of the Amazon ML platform, helps overcome overfitting, making models more robust and better able to handle previously unseen data.
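To make the effect of regularization concrete, the following sketch fits a one-feature linear model without an intercept and shows how each penalty shrinks the weight. It assumes the objective 0.5 * sum((y - w*x)^2) plus a penalty of 0.5 * lam * w^2 (L2) or lam * |w| (L1), for which closed-form solutions exist. The function names and the toy data are hypothetical, and this is not how Amazon ML trains its models internally; it only illustrates the shrinkage behavior.

```python
def fit_l2(xs, ys, lam):
    """Minimize 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2 for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def fit_l1(xs, ys, lam):
    """Minimize 0.5 * sum((y - w*x)^2) + lam * |w| via soft thresholding."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


# Toy data generated by y = 2 * x (illustrative only).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for lam in (0.0, 14.0, 30.0):
    print(lam, fit_l2(xs, ys, lam), fit_l1(xs, ys, lam))
```

With lam set to zero, both functions return the ordinary least squares weight. As lam grows, the L2 weight shrinks smoothly toward zero without reaching it, while the L1 weight becomes exactly zero once lam exceeds the absolute value of sum(x * y), which is why L1 regularization tends to produce sparse models.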

We are now ready to dive into the Amazon ...
