Naive Bayes machine learning with Spark 2.0 MLlib

In this recipe, we use the well-known Iris dataset and Apache Spark's NaiveBayes() API to predict which of three flower classes a given set of observations belongs to. Because this is a multi-class classifier, evaluating its fit requires multi-class metrics. The previous recipe, by contrast, used a binary classifier and a binary metric to measure fit.
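To make "multi-class metrics" concrete, here is a minimal plain-Python sketch of how a confusion matrix, overall accuracy, and per-class precision and recall can be computed for a three-class problem. It is not the recipe's Spark code: the label lists are hypothetical values invented for illustration, and the function names `confusion_matrix` and `per_class_metrics` are assumptions. In Spark itself, the `MulticlassMetrics` class in `org.apache.spark.mllib.evaluation` reports comparable figures.

```python
from collections import Counter

# Hypothetical labels for illustration only (not real model output):
# 0 = setosa, 1 = versicolor, 2 = virginica.
actual = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
predicted = [0, 0, 0, 1, 1, 2, 2, 2, 2, 1]


def confusion_matrix(actual, predicted, num_classes):
    """Build a matrix whose rows are actual classes and columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in range(num_classes)]
            for a in range(num_classes)]


def per_class_metrics(matrix):
    """Return {class_index: (precision, recall)} derived from the confusion matrix."""
    n = len(matrix)
    metrics = {}
    for c in range(n):
        true_pos = matrix[c][c]
        predicted_c = sum(matrix[r][c] for r in range(n))  # column total
        actual_c = sum(matrix[c])                          # row total
        precision = true_pos / predicted_c if predicted_c else 0.0
        recall = true_pos / actual_c if actual_c else 0.0
        metrics[c] = (precision, recall)
    return metrics


matrix = confusion_matrix(actual, predicted, 3)
# Accuracy is the share of observations on the matrix diagonal.
accuracy = sum(matrix[c][c] for c in range(3)) / len(actual)
metrics = per_class_metrics(matrix)

print("confusion matrix:", matrix)
print("accuracy:", accuracy)
for label, (precision, recall) in metrics.items():
    print(f"class {label}: precision={precision:.3f} recall={recall:.3f}")
```

Unlike the binary case, a single number rarely tells the whole story here: a model can score well overall while confusing two of the three classes, which is why per-class precision and recall are usually inspected alongside accuracy.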

