5 Segmentation algorithms

5.1 Segmenting customers with data mining algorithms

In this chapter, we’ll focus on the data mining modeling techniques used for segmentation. Although clustering algorithms can be applied directly to the input data, it is advisable to first apply a data reduction technique, which simplifies and enhances the segmentation process by removing redundant information. This preprocessing step is optional but highly recommended: it can lead to rich, unbiased segmentation solutions that account for all the underlying data dimensions without being distorted by intercorrelations among the inputs. Therefore, this chapter also covers principal components analysis (PCA), an established data reduction technique that constructs meaningful, uncorrelated compound measures which can then be used in clustering.

5.2 Principal components analysis

PCA is a statistical technique used to reduce the dimensionality of the original inputs. It analyzes the correlations between the input fields and derives a core set of component measures that can efficiently reduce the data dimensionality without sacrificing much of the information in the original inputs.
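As a rough illustration of the idea, the sketch below extracts the first principal component from the correlation matrix of three inputs using plain power iteration. It is a minimal sketch under stated assumptions, not the procedure used by any particular data mining tool: the field names and usage figures are hypothetical, and the helper functions (`standardize`, `correlation_matrix`, `first_component`) are introduced here only for illustration.

```python
from math import sqrt

def standardize(column):
    """Convert a column to z-scores (mean 0, sample standard deviation 1)."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(z_columns):
    """Pearson correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z_columns]
            for zi in z_columns]

def first_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector.

    Assumes the starting vector is not orthogonal to the leading
    eigenvector, which holds for the illustrative data below.
    """
    k = len(matrix)
    v = [1.0 / sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

# Hypothetical monthly usage figures for five customers (illustrative only).
voice_minutes = [120, 300, 80, 450, 210]
sms_count = [30, 80, 15, 110, 55]
data_mb = [900, 200, 1000, 150, 500]

z = [standardize(c) for c in (voice_minutes, sms_count, data_mb)]
corr = correlation_matrix(z)
eigenvalue, loadings = first_component(corr)

# Share of the total variance captured by the first component.
# The total variance of k standardized inputs equals k.
explained_share = eigenvalue / len(corr)

# One composite score per customer: a weighted sum of the standardized inputs.
scores = [sum(loadings[j] * z[j][i] for j in range(len(z)))
          for i in range(len(voice_minutes))]
```

Because the three hypothetical inputs are strongly intercorrelated, a single component captures most of their combined variance, so `scores` could stand in for the three original fields in a later clustering step.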

PCA examines the correlations among the original inputs and then appropriately constructs composite measures that take these correlations into account. A brief note on linear correlations: if two or more continuous fields tend to covary, then they are correlated. If their relationship ...
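The notion of linear correlation mentioned above can be made concrete with the Pearson correlation coefficient. The following is a minimal sketch using only the Python standard library; the function name `pearson` and the customer figures are hypothetical and serve purely as an illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly usage figures for five customers (illustrative only).
voice_minutes = [120, 300, 80, 450, 210]
sms_count = [30, 80, 15, 110, 55]
data_mb = [900, 200, 1000, 150, 500]

# Fields that rise together give a coefficient near +1;
# fields that move in opposite directions give one near -1.
r_voice_sms = pearson(voice_minutes, sms_count)
r_voice_data = pearson(voice_minutes, data_mb)
```

In this made-up data, voice and SMS usage covary positively, while voice and data usage covary negatively; both pairs are therefore correlated, only in opposite directions.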
