Choosing between recipes and data preprocessing

So far, we have transformed our initial dataset with both scripts and Amazon ML recipes. The two techniques are complementary. Some transformations and data manipulations can only be done by preprocessing the data, as we did in Chapter 4, Loading and Preparing the Dataset with Athena and SQL. We could have achieved similar results with a scripting language such as Python or R, both of which lend themselves well to creative feature engineering. SQL and scripts are also better suited to handling outliers and missing values, corrections that are not available with Amazon ML recipes.
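As an illustration of the kind of correction that belongs in a preprocessing script, here is a minimal Python sketch that fills missing values with the median and clips outliers to an interquartile-range fence. The function name `impute_and_clip`, the `ages` sample, and the choice of median imputation with a 1.5 × IQR fence are assumptions made for this example; they are not taken from the book's dataset or scripts.

```python
import statistics


def impute_and_clip(values, k=1.5):
    """Fill missing entries (None) with the median of the observed values,
    then clip every value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)

    # Quartiles of the observed values only (missing entries are ignored).
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    filled = [median if v is None else v for v in values]
    return [min(max(v, low), high) for v in filled]


# Hypothetical column with two missing values and one obvious outlier (400).
ages = [22, 25, None, 31, 29, 27, 400, None, 24]
cleaned = impute_and_clip(ages)
print(cleaned)
```

A step like this would run before the data is uploaded, so that the recipe only has to handle transformations such as binning or normalization on an already cleaned column.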

The goal of the Amazon ML transformations is to prepare the data for consumption by the Amazon ML algorithm, whereas scripted ...
