How it works...

In this example, we explored feature scaling, a critical preprocessing step for many machine learning algorithms, such as classifiers. We started by loading the wine data files, extracted an identifier, and used the next three columns to create a feature vector.
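The loading step can be pictured with a small sketch in plain Python rather than Spark. The rows below are invented for illustration and are not taken from the wine data set, and `parse_row` is a hypothetical helper name. It only shows the shape of the result: one identifier plus a three-element feature vector per row.

```python
# Hypothetical rows: the values are invented, not taken from the wine data set.
raw_lines = [
    "1,14.2,1.7,2.4",
    "2,13.1,2.3,2.6",
    "3,12.4,3.1,2.2",
]


def parse_row(line):
    """Split a CSV line into (identifier, three-element feature vector)."""
    fields = line.split(",")
    identifier = int(fields[0])                 # first column: identifier
    features = [float(v) for v in fields[1:4]]  # next three columns: features
    return identifier, features


rows = [parse_row(line) for line in raw_lines]
```

Each entry of `rows` pairs an identifier with its feature vector, which is the layout the scaler works on in the next step.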

Then, we created a MinMaxScaler object, configuring the minimum and maximum of the range to scale our values into. We created the scaling model by calling the fit() method on the scaler object, and then used that model to scale the values in our DataFrame.
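To make the fit-then-transform split concrete, here is a minimal plain-Python sketch of the arithmetic behind min-max scaling. It is not Spark's implementation: `fit_min_max` and `transform_min_max` are hypothetical names, and the input vectors are invented. The formula follows the one documented for Spark's MinMaxScaler, where a column with a single constant value maps to the midpoint of the target range.

```python
def fit_min_max(vectors):
    """'Fit' step: record the minimum and maximum of each feature column."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform_min_max(vectors, col_min, col_max, lo=-1.0, hi=1.0):
    """'Transform' step: rescale every feature into the range [lo, hi]."""
    scaled = []
    for vec in vectors:
        row = []
        for x, e_min, e_max in zip(vec, col_min, col_max):
            if e_max == e_min:
                # Constant column: map to the midpoint of the target range.
                row.append(0.5 * (lo + hi))
            else:
                row.append((x - e_min) / (e_max - e_min) * (hi - lo) + lo)
        scaled.append(row)
    return scaled


# Invented feature vectors, for illustration only.
vectors = [[1.0, 10.0, 5.0], [2.0, 20.0, 5.0], [3.0, 40.0, 5.0]]
col_min, col_max = fit_min_max(vectors)
scaled = transform_min_max(vectors, col_min, col_max)
```

With the target range set to -1 and 1, the smallest value in each column maps to -1 and the largest to 1. Keeping the two steps separate mirrors the recipe: the statistics are learned once and can then be applied to any DataFrame with the same columns.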

Finally, we displayed the resulting DataFrame and observed that the scaled feature vector values fall between -1 and 1.
