Checking a Range Based on the Interquartile Range

Yet another way to look for outliers is a method devised by advocates of exploratory data analysis (EDA). Like the previous method, which was based on a trimmed mean, it is robust. It uses the interquartile range (the distance from the 25th percentile to the 75th percentile) and defines an outlier as a value that lies more than a chosen multiple of the interquartile range below the lower hinge or above the upper hinge. For those not familiar with EDA terminology, the lower hinge is the value corresponding to the 25th percentile (the value below which 25% of the data values lie). The upper hinge is the value corresponding to the 75th percentile. For example, you may want to examine any data values more than ...
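
The rule described above can be sketched in a few lines. This is an illustration in Python rather than the SAS code the book itself develops, and it rests on several assumptions: the function name iqr_outliers, the sample heart-rate values, and the multiplier of 1.5 (the conventional boxplot value, not a figure taken from the text) are all hypothetical. Quartile definitions also differ slightly between packages, so the hinges computed here may not match another tool's percentiles exactly.

```python
from statistics import quantiles


def iqr_outliers(values, multiplier):
    """Flag values lying more than `multiplier` IQRs outside the hinges.

    Returns (lower_fence, upper_fence, outliers).
    """
    # The 25th and 75th percentiles stand in for the lower and upper hinge.
    # method="inclusive" interpolates between observed data points.
    lower_hinge, _median, upper_hinge = quantiles(values, n=4, method="inclusive")
    iqr = upper_hinge - lower_hinge

    lower_fence = lower_hinge - multiplier * iqr
    upper_fence = upper_hinge + multiplier * iqr

    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


# Made-up sample data with one implausible value at the end.
heart_rates = [58, 62, 64, 66, 68, 70, 72, 74, 76, 210]

# multiplier=1.5 is an assumed, illustrative choice.
lower, upper, flagged = iqr_outliers(heart_rates, multiplier=1.5)
print(lower, upper, flagged)
```

Because the hinges and the interquartile range depend only on the middle half of the data, the single extreme value barely moves the fences. That is what makes the check robust.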
