Chapter 11

An Overview of Exploratory Data Analysis (EDA)

In This Chapter

  • Seeing how the focus of EDA is different from traditional statistical analysis

  • Exploring important graphical EDA techniques

  • Understanding key quantitative EDA techniques

Exploratory data analysis (EDA) is an approach to data analysis in which you first determine a dataset’s properties and then choose the appropriate technique or techniques for analyzing the data. This helps ensure that you won’t impose assumptions on the data that aren’t warranted. More traditional approaches impose a specific model on a dataset based on predetermined assumptions; with EDA, the structure of the dataset determines which techniques should be used to analyze it.
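As a rough illustration of looking at a dataset's properties before picking a technique, the following Python sketch computes a few basic summary values. The function name `summarize` and the sample values are hypothetical, chosen only for illustration, and the skewness formula used here is one simple moment-based choice among several.

```python
import statistics


def summarize(values):
    """Return a few basic properties of a numeric sample (illustrative helper)."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    # Moment-based skewness: near 0 for symmetric data,
    # positive when the data has a long right tail.
    if stdev > 0:
        skew = sum((x - mean) ** 3 for x in values) / (n * stdev ** 3)
    else:
        skew = 0.0
    return {"n": n, "mean": mean, "median": median,
            "stdev": stdev, "skewness": skew}


# Hypothetical sample containing one unusually large value.
sample = [2, 3, 3, 4, 4, 5, 5, 6, 40]
props = summarize(sample)
for name, value in props.items():
    print(f"{name}: {value:.3f}")
```

If the mean sits well above the median and the skewness is clearly positive, as it is for this made-up sample, a technique that assumes symmetric data may not be a good fit, which is the kind of signal EDA is meant to surface before a model is chosen.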

The field of EDA was introduced in 1977 by the mathematician John Tukey. Tukey believed that it was important to let a dataset determine the types of analysis that should be performed on it, rather than look for confirmation of predetermined assumptions.

EDA is designed to accomplish several important objectives:

  • Understanding the properties of a dataset ...

