Choosing the optimal value of K is one of the harder parts of k-means clustering. A few methods can help with this.
As discussed earlier, a good cluster is one whose observations are compact, that is, close to one another. This compactness is quantified by the intra-cluster distance. The intra-cluster distance for a cluster is essentially the sum of the pairwise distances between all possible pairs of points in that cluster.
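The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `intra_cluster_distance` and the two sample clusters are hypothetical, Euclidean distance is assumed, and each unordered pair of points is counted once.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def intra_cluster_distance(points):
    """Sum of Euclidean distances over every unordered pair of points."""
    return sum(dist(p, q) for p, q in combinations(points, 2))


# Two hypothetical clusters: one tight, one spread out.
tight = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
loose = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0)]

print(intra_cluster_distance(tight))  # smaller value: more compact cluster
print(intra_cluster_distance(loose))  # larger value: less compact cluster
```

The tighter cluster yields the smaller sum, which is the sense in which a lower intra-cluster distance indicates a more compact cluster.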
If we denote the intra-cluster distance by W, then for a cluster k, the intra-cluster distance can be denoted by:
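One form consistent with the definition above, offered as a sketch under stated assumptions: $C_k$ (the set of points in cluster $k$) and $d(x_i, x_j)$ (the distance between two points, for example Euclidean) are notation introduced here for illustration, and each unordered pair is counted once.

```latex
W_k = \sum_{\substack{x_i,\, x_j \in C_k \\ i < j}} d(x_i, x_j)
```

Some texts sum over ordered pairs instead, which counts every pair twice and doubles the value.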
Generally, the normalized ...