In this chapter, we looked at:

- Partition-based clustering.
- The k-means algorithm is a partition-based clustering algorithm in which each cluster is represented by its centroid. Given a set of *n* data points in a D-dimensional space and an integer *k*, the problem is to place *k* points (the centers) so as to minimize the sum of squared errors (SSE).
- The k-medoids algorithm is a partition-based clustering algorithm in which the representative of each resulting cluster is chosen from the dataset itself; that is, it is one of the data objects belonging to that cluster.
- CLARA depends on sampling. Instead of working on the entire dataset, it draws samples from the original dataset. PAM is then applied to each sample. Then, the best result is kept during all ...
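
To make the difference between a centroid and a medoid concrete, here is a minimal Python sketch of the k-means objective. It is an illustration under stated assumptions, not the chapter's own implementation: the function names (`kmeans`, `sse`, `centroid`, `medoid`) are hypothetical, the initial centers are simply sampled at random from the data, and the loop is the common Lloyd-style alternation between assignment and center update, which only finds a local minimum of the SSE.

```python
import math
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
        for p in points
    ]


def sse(points, centers, labels):
    """Sum of squared errors: squared distance of each point to its center."""
    return sum(squared_distance(p, centers[c]) for p, c in zip(points, labels))


def centroid(cluster):
    """Coordinate-wise mean; usually NOT one of the data points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def medoid(cluster):
    """The data point with the smallest total distance to the others."""
    return min(cluster, key=lambda m: sum(math.dist(m, p) for p in cluster))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd-style k-means sketch (assumption: random initial centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = assign(points, centers)
    for _ in range(iterations):
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(centroid(members) if members else centers[c])
        new_labels = assign(points, new_centers)
        centers = new_centers
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(data, k=2)
    print("centers:", centers)
    print("labels:", labels)
    print("SSE:", sse(data, centers, labels))
    print("medoid of first three points:", medoid(data[:3]))
```

A k-medoids variant would replace `centroid` with `medoid` in the update step, so every representative remains an actual data object. PAM itself uses a swap-based search rather than this simple update, so that substitution is only a rough analogue.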
