An overview of clustering

Clustering is a division of data into groups of similar objects. Each object (cluster) consists of objects that are similar between themselves and dissimilar to objects of other groups. The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. Clustering can be used in varied areas of application from data mining (DNA analysis, marketing studies, insurance studies, and so on.), text mining, information retrieval, statistical computational linguists, and corpus-based computational lexicography. Some of the requirements that must be fulfilled by clustering algorithms are as follows:

  • Scalability
  • Dealing with various types of attributes
  • Discovering clusters of arbitrary shapes
  • The ability to deal ...

Get Practical Machine Learning Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.