More about bisecting KMeans can be found at:
- http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.clustering.BisectingKMeans
- http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.clustering.BisectingKMeansModel
We use clustering to explore the data and get a feel for how it falls into clusters. Bisecting KMeans is an interesting case because it sits between hierarchical analysis and KMeans clustering.
The best way to conceptualize bisecting KMeans is as a recursive, hierarchical KMeans. Like KMeans, the algorithm divides the data using a similarity measure, but it does so hierarchically: it starts with all the data in one cluster and repeatedly splits a cluster in two, a scheme intended to increase accuracy. It is particularly prevalent ...
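To make the "recursive hierarchical KMeans" idea concrete, here is a minimal sketch in plain Python. It is not Spark's implementation and does not use the BisectingKMeans API. The helper names (sq_dist, mean, sse, two_means, bisecting_kmeans) are hypothetical, and choosing the cluster with the largest sum of squared errors as the next one to split is an assumption made for illustration; other selection rules are possible.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(sq_dist(p, center) for p in points)


def two_means(points, rng, iters=20):
    """Ordinary KMeans with k=2 on one cluster; returns the two halves."""
    # Start from two distinct points so that neither half starts empty.
    centers = rng.sample(sorted(set(points)), 2)
    left, right = [], []
    for _ in range(iters):
        left = [p for p in points
                if sq_dist(p, centers[0]) <= sq_dist(p, centers[1])]
        right = [p for p in points
                 if sq_dist(p, centers[0]) > sq_dist(p, centers[1])]
        new_centers = [mean(left), mean(right)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return left, right


def bisecting_kmeans(points, k, seed=0):
    """Start with one cluster and bisect until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        # Assumption: split the cluster with the largest SSE next.
        worst = max(clusters, key=sse)
        if sse(worst) == 0:
            break  # only identical points remain; nothing left to split
        clusters.remove(worst)
        clusters.extend(two_means(worst, rng))
    return clusters


if __name__ == "__main__":
    # Hypothetical toy data: three small, well-separated groups.
    points = [
        (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
        (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
    ]
    for cluster in bisecting_kmeans(points, k=3):
        print(sorted(cluster))
```

The loop in bisecting_kmeans is where the hierarchy comes from: every split only ever looks at the points of one existing cluster, so the result can be read as a binary tree whose leaves are the final clusters.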