In this recipe, we'll see how to explore data by computing summary statistics over it.
To step through this recipe, you will need a running Spark cluster in any one of the modes, that is, local, standalone, YARN, or Mesos. For installing Spark on a standalone cluster, please refer to http://spark.apache.org/docs/latest/spark-standalone.html. Also, include the Spark MLlib package in the build.sbt file so that the related libraries are downloaded and the API can be used. Install Scala and Java, and optionally Hadoop.
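As a sketch, a minimal build.sbt with the MLlib dependency might look like the following; the project name, Scala version, and Spark version shown here are assumptions and should be matched to your installed cluster:

```scala
// build.sbt -- sketch; the version numbers below are assumptions,
// align them with the Spark and Scala versions on your cluster.
name := "spark-exploration"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.0.0",
  "org.apache.spark" %% "spark-mllib" % "2.0.0"
)
```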
/*Summary statistics*/
val summary = selected_Data.describe()
println("Summary statistics:")
summary.show()
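To make concrete what describe() reports for a numeric column, here is a plain-Scala illustration (no cluster required) of the same five statistics: count, mean, stddev, min, and max. The object name, helper function, and sample values are all made up for illustration; note that Spark's describe() uses the sample standard deviation (dividing by n - 1):

```scala
object SummaryStats {
  // Compute the statistics that DataFrame.describe() reports
  // for a single numeric column.
  def describe(xs: Seq[Double]): Map[String, Double] = {
    val n = xs.length
    val mean = xs.sum / n
    // Sample standard deviation: divide by (n - 1), as Spark does
    val stddev = math.sqrt(xs.map(x => math.pow(x - mean, 2)).sum / (n - 1))
    Map(
      "count"  -> n.toDouble,
      "mean"   -> mean,
      "stddev" -> stddev,
      "min"    -> xs.min,
      "max"    -> xs.max
    )
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical column values, purely for illustration
    val stats = describe(Seq(1.0, 2.0, 3.0, 4.0))
    stats.toSeq.sortBy(_._1).foreach { case (k, v) =>
      println(f"$k%-7s $v%.4f")
    }
  }
}
```

On a real DataFrame, describe() computes these per numeric column in a distributed fashion and returns them as a new DataFrame, which is why the recipe prints it with show().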