How it works...

The data is the same as in the previous recipe, but here we use Random Forest and the MulticlassMetrics API to solve the classification problem:

  • RandomForest.trainClassifier()
  • MulticlassMetrics()

Random Forest exposes many options that we can adjust to obtain decision boundaries suited to classifying complex surfaces. Some of the parameters are listed here:

 val numClasses = 2
 val categoricalFeaturesInfo = Map[Int, Int]()
 val numTrees = 3 // Use more in practice.
 val featureSubsetStrategy = "auto" // Let the algorithm choose.
 val maxDepth = 4
 val maxBins = 32
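As a minimal sketch of how the parameters above feed into RandomForest.trainClassifier(), the following trains a small forest on a handful of made-up points. The toy data, the object name RandomForestParamsSketch, and the impurity value "gini" are assumptions for illustration; impurity is not among the parameters listed in this recipe, but the MLlib call requires one.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.tree.model.RandomForestModel
import org.apache.spark.sql.SparkSession

object RandomForestParamsSketch {

  def train(spark: SparkSession): RandomForestModel = {
    // Hypothetical toy data (not the recipe's dataset): two numeric features, labels 0.0 and 1.0.
    val data = spark.sparkContext.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(1.0, 1.2)),
      LabeledPoint(0.0, Vectors.dense(1.5, 0.8)),
      LabeledPoint(0.0, Vectors.dense(0.9, 1.1)),
      LabeledPoint(0.0, Vectors.dense(1.2, 1.4)),
      LabeledPoint(1.0, Vectors.dense(5.0, 6.1)),
      LabeledPoint(1.0, Vectors.dense(5.5, 5.8)),
      LabeledPoint(1.0, Vectors.dense(6.2, 6.0)),
      LabeledPoint(1.0, Vectors.dense(5.9, 6.4))
    ))

    // Parameters as listed in the recipe.
    val numClasses = 2
    val categoricalFeaturesInfo = Map[Int, Int]() // empty map: treat all features as continuous
    val numTrees = 3                              // use more in practice
    val featureSubsetStrategy = "auto"            // let the algorithm choose
    val maxDepth = 4
    val maxBins = 32

    // Assumption: the recipe's impurity setting is not shown here; "gini" is one valid choice.
    val impurity = "gini"

    RandomForest.trainClassifier(
      data, numClasses, categoricalFeaturesInfo,
      numTrees, featureSubsetStrategy, impurity, maxDepth, maxBins)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("RandomForestParamsSketch")
      .getOrCreate()

    val model = train(spark)
    // Print the structure of the learned trees.
    println(model.toDebugString)

    spark.stop()
  }
}
```

Raising numTrees and maxDepth generally lets the ensemble fit more intricate boundaries at the cost of longer training time, so these two are usually the first to tune.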

The confusion matrix is the noteworthy output of this recipe. It is obtained via the MulticlassMetrics() API call. To interpret the preceding confusion matrix, ...
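To make the layout of a confusion matrix concrete, here is a small sketch that builds one with MulticlassMetrics from hand-written (prediction, label) pairs. The pairs and the names ConfusionMatrixSketch, samplePairs, and metricsFor are hypothetical and are not output from this recipe. In MLlib's confusion matrix, rows correspond to actual classes and columns to predicted classes, both ordered by ascending label.

```scala
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.sql.SparkSession

object ConfusionMatrixSketch {

  // Hypothetical (prediction, label) pairs, chosen so the matrix is asymmetric.
  val samplePairs: Seq[(Double, Double)] = Seq(
    // Actual class 0.0: two predicted correctly, two predicted as 1.0.
    (0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0),
    // Actual class 1.0: three predicted correctly, one predicted as 0.0.
    (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (0.0, 1.0)
  )

  def metricsFor(spark: SparkSession, pairs: Seq[(Double, Double)]): MulticlassMetrics =
    new MulticlassMetrics(spark.sparkContext.parallelize(pairs))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("ConfusionMatrixSketch")
      .getOrCreate()

    val metrics = metricsFor(spark, samplePairs)
    val cm: Matrix = metrics.confusionMatrix

    // cm(i, j) counts the points whose actual class is i and predicted class is j.
    println(cm)
    println(s"accuracy       = ${metrics.accuracy}")
    println(s"precision(1.0) = ${metrics.precision(1.0)}")
    println(s"recall(1.0)    = ${metrics.recall(1.0)}")

    spark.stop()
  }
}
```

For these pairs the first row is 2, 2 and the second row is 1, 3: the diagonal holds the five correct predictions out of eight (accuracy 0.625), while the off-diagonal entries show which class each error was confused with.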
