Preprocessing recapitulation

The following table summarizes the common issues found in raw data, whether linear models are sensitive to each one, and what Amazon ML offers to deal with it:

| Issue | Linear model sensitivity | Available on Amazon ML |
|---|---|---|
| Missing values | Yes | Dealt with automatically |
| Standardization | Yes | z-score standardization |
| Outliers | Yes | Quantile binning |
| Multicollinearity | Yes | No |
| Imbalanced datasets | Yes | Uses the right metric (F1 score); no sampling strategy (may exist in background) |
| Nonlinearities | Yes | Quantile binning |
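To make the two transformations named in the table concrete, here is a minimal sketch of z-score standardization and quantile binning in plain Python. It illustrates the general techniques only and does not reproduce how Amazon ML implements them internally; the function names `zscore` and `quantile_bin`, the choice of four bins, and the sample values are all assumptions made for this example.

```python
import bisect
import statistics


def zscore(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def quantile_bin(values, n_bins=4):
    """Replace each value with the index of the quantile bin it falls into."""
    # statistics.quantiles returns the n_bins - 1 interior cut points.
    edges = statistics.quantiles(values, n=n_bins)
    return [bisect.bisect_right(edges, v) for v in values]


# Hypothetical sample with one extreme outlier (1000.0).
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 1000.0]

scaled = zscore(data)
bins = quantile_bin(data, n_bins=4)
```

In this sample the outlier ends up in the same top bin as 7.0, which is why binning limits its influence on a linear model: the model sees a bin index instead of the raw magnitude. Standardization alone does not have that effect, since the outlier remains far from the other scaled values.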
