A clustering algorithm such as K-means locates the centroid of each group of data points. To make the clustering accurate and effective, the algorithm evaluates the distance from each point to the centroid of every cluster and assigns the point to the nearest one.
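The centroid and distance steps can be illustrated with a minimal, plain-Python sketch of K-means. It is not the Spark implementation; the function names (`assign_clusters`, `update_centroids`, `kmeans`), the toy data, and the hand-picked initial centroids are all assumptions made for illustration, and Euclidean distance is assumed as the distance measure.

```python
import math


def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points currently assigned to it."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # An empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        dims = len(members[0])
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return new_centroids


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign_clusters(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


# Hypothetical toy data: three well-separated groups, so k = 3.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0),
          (1.0, 8.0), (2.0, 9.0), (1.0, 9.0)]

# Seeding with one point from each group is an illustrative shortcut, not a
# general initialization strategy.
labels, centroids = kmeans(points, initial_centroids=[points[0], points[3], points[6]])
print(labels)
print(centroids)
```

In practice, the initial centroids are usually chosen randomly or with a seeding scheme such as k-means++, and the result can depend on that choice; the fixed seeds here only keep the sketch deterministic.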
Ultimately, the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. For example, the K-means algorithm tries to group related data points into three predefined clusters (that is, k = 3), as shown in Figure 8:
In our case, using a combined approach of Spark, ...