12.9 Summary

■ Assume that a given statistical process is used to generate a set of data objects. An outlier is a data object that deviates significantly from the rest of the objects, as if it were generated by a different mechanism.

■ Types of outliers include global outliers, contextual outliers, and collective outliers. An object may be more than one type of outlier.

■ Global outliers are the simplest form of outlier and the easiest to detect. A contextual outlier deviates significantly with respect to a specific context of the object (e.g., a Toronto temperature value of 28°C is an outlier if it occurs in the context of winter). A subset of data objects forms a collective outlier if the objects as a whole deviate significantly from the entire ...
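The difference between a global and a contextual outlier can be made concrete with a minimal sketch. It is an illustration only, not the book's method: the readings are invented, the season label stands in for the context, and the z-score test with a threshold of 2.0 is an assumed, simple detector chosen for brevity.

```python
from statistics import mean, stdev

# Hypothetical (season, temperature in °C) readings, invented for illustration.
readings = [
    ("winter", -8.0), ("winter", -5.0), ("winter", -3.0), ("winter", -6.0),
    ("winter", -4.0), ("winter", -7.0), ("winter", 28.0),
    ("summer", 24.0), ("summer", 27.0), ("summer", 29.0), ("summer", 26.0),
    ("summer", 28.0), ("summer", 25.0), ("summer", 30.0),
]


def z_scores(values):
    """Return the z-score of each value, using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def global_outliers(readings, threshold=2.0):
    """Flag readings that deviate from the whole data set, ignoring context."""
    zs = z_scores([temp for _, temp in readings])
    return [r for r, z in zip(readings, zs) if abs(z) > threshold]


def contextual_outliers(readings, threshold=2.0):
    """Flag readings that deviate from the other readings in the same season."""
    flagged = []
    for season in sorted({s for s, _ in readings}):
        group = [r for r in readings if r[0] == season]
        zs = z_scores([temp for _, temp in group])
        flagged += [r for r, z in zip(group, zs) if abs(z) > threshold]
    return flagged


print("global:", global_outliers(readings))
print("contextual:", contextual_outliers(readings))
```

With these assumed readings, 28°C is unremarkable against the full year of data, so the global test flags nothing, while the same value stands out once it is compared only with the other winter readings.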

