Normalizing data with Spark

In this recipe, we demonstrate normalizing (scaling) data before feeding it to an ML algorithm. Many ML algorithms, such as Support Vector Machines (SVMs), work better with scaled input vectors than with raw values.
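
To make the idea of scaling concrete, here is a minimal sketch of min-max normalization in plain Python. It is not the recipe's Spark code and does not use Spark at all; it only illustrates the arithmetic that a per-feature scaler performs. The function name `min_max_scale`, its parameters `lo` and `hi`, and the sample values are assumptions made for illustration. Mapping a constant column to the midpoint of the target range is also a choice made here, not something the recipe specifies.

```python
def min_max_scale(rows, lo=0.0, hi=1.0):
    """Linearly rescale each column (feature) of `rows` into [lo, hi].

    `rows` is a list of equal-length lists of floats, one list per sample.
    """
    if not rows:
        return []

    # Transpose so that each tuple holds all values of one feature.
    columns = list(zip(*rows))
    col_min = [min(c) for c in columns]
    col_max = [max(c) for c in columns]

    scaled = []
    for row in rows:
        new_row = []
        for x, mn, mx in zip(row, col_min, col_max):
            if mx == mn:
                # Constant feature: no spread to rescale, so use the midpoint
                # of the target range (an assumption of this sketch).
                new_row.append(0.5 * (lo + hi))
            else:
                new_row.append((x - mn) / (mx - mn) * (hi - lo) + lo)
        scaled.append(new_row)
    return scaled


# Hypothetical data: two features on very different scales.
raw = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled = min_max_scale(raw)
print(scaled)
```

After scaling, both features span the same range, so neither dominates a distance or margin computation simply because of its units. That is the motivation for scaling inputs to algorithms such as SVMs.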
