Chapter 5

Optimization Clustering Techniques

5.1 Introduction

In this chapter we consider a class of clustering techniques which produce a partition of the individuals into a specified number of groups, by either minimizing or maximizing some numerical criterion. Such optimization methods differ from the methods described in the previous chapter in not necessarily forming hierarchical classifications of the data. Differences between the methods in this class arise both because of the variety of clustering criteria that might be optimized and the various optimization algorithms that might be used. In the initial discussion of these methods it is assumed that the number of groups has been fixed by the investigator. Techniques potentially useful for suggesting the ‘correct’ number of clusters are described in Section 5.5.

The basic idea behind the methods to be described in this chapter is that associated with each partition of the n individuals into the required number of groups, g, is an index c(n, g), the value of which measures some aspect of the ‘quality’ of this particular partition. For some indices high values are associated with a desirable cluster solution, whereas for others a low value is sought (see later). Associating an index with each partition allows them to be compared. A variety of such clustering criteria have been suggested. Some operate on the basis of the inter-individual dissimilarities; others employ the original data matrix.

We start by introducing cluster ...

Get Cluster Analysis, 5th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.