Chapter 7.  Dealing with Outliers Using Python

A certain percentage of any dataset will consist of outliers: points or responses that fall beyond the reasonable ranges established for the data, based on its context. Deciding how to respond to outliers once they are found becomes increasingly challenging in big data initiatives.
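As a minimal sketch of the idea of a context-based range, the snippet below flags values that fall outside bounds chosen by the analyst. The function name `find_outliers`, the sample ages, and the 0 to 120 range are all assumptions made for illustration; they are not taken from the chapter's own examples.

```python
def find_outliers(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if v < low or v > high]


# Hypothetical survey responses for a respondent's age in years.
ages = [34, 27, 45, 230, 61, -3, 52]

# Assumed plausible range for a human age; the bounds come from context,
# not from the data itself.
outliers = find_outliers(ages, low=0, high=120)
print(outliers)  # [230, -3]
```

The bounds are passed in explicitly because what counts as reasonable depends on what the data represents; the same values could be perfectly valid in another context.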

In this chapter, we focus on dealing with outliers as they relate to big data visualization, introduce the Python language, and offer working examples that demonstrate effective ways of handling outliers and other anomalies in big data using Python.

This chapter is organized into the following main sections:

  • About Python
  • Python and big data
  • Outliers
  • Some basic examples
  • More examples

About Python
