R for Data Science by Dan Toomey

Anomaly detection

We can use R to detect anomalies in a dataset. Anomaly detection is used in a number of areas, such as intrusion detection, fraud detection, and system health monitoring. In R, anomalous values are usually called outliers. R supports the detection of outliers in a number of ways, as listed here:

  • Statistical tests
  • Depth-based approaches
  • Deviation-based approaches
  • Distance-based approaches
  • Density-based approaches
  • High-dimensional approaches
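
As a rough illustration of the simplest kind of statistical approach, the following sketch flags values whose z-score exceeds a cutoff. The data, the variable names, and the cutoff of 3 are all assumptions made for this example and do not come from the book; it is one possible starting point rather than a recommended method.

```r
# Hypothetical readings: twenty values near 10 and one extreme value.
readings <- c(rep(c(9, 10, 11, 10), times = 5), 50)

# Standardize each value: distance from the mean in units of standard deviation.
z_scores <- (readings - mean(readings)) / sd(readings)

# Assumed cutoff: treat anything more than 3 standard deviations away as an outlier.
cutoff <- 3
outlier_index <- which(abs(z_scores) > cutoff)
outlier_values <- readings[outlier_index]

print(outlier_index)
print(outlier_values)
```

Note that the mean and standard deviation are themselves pulled toward extreme values, so a rule like this tends to work only when outliers are few and the sample is reasonably large.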

Show outliers

R has a function for displaying outliers, identify, which is used together with a boxplot.

The boxplot function produces a box-and-whisker plot (see following graph). The boxplot function has a number of graphics options. For this example, we do not need to set any.
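
A minimal sketch of calling boxplot with its default options is shown here. The sample data and variable names are hypothetical and chosen only so that two points fall outside the whiskers; the `out` component of the value returned by boxplot holds the points drawn as outliers.

```r
# Hypothetical sample data: mostly moderate values plus two extreme ones.
values <- c(12, 14, 15, 15, 16, 17, 18, 19, 20, 21, 55, 60)

# Draw the box-and-whisker plot with default graphics options.
# The summary statistics behind the plot are returned invisibly.
bp <- boxplot(values)

# Points lying beyond the whiskers are returned in the `out` component.
outliers <- bp$out
print(outliers)
```

With the default settings, a point counts as an outlier when it lies more than 1.5 times the box length beyond the box, which is controlled by the `range` argument of boxplot.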

The identify ...
