Statistics for Big Data For Dummies by David Semmelroth, Alan Anderson

Chapter 21

Ten (or So) Questions Answered by Exploratory Data Analysis (EDA)

In This Chapter

  • Understanding the most important questions answered by Exploratory Data Analysis (EDA)
  • Seeing how to use EDA to determine if a dataset conforms to your assumptions

This chapter covers ten key questions about a dataset that you can answer with exploratory data analysis (EDA). These questions focus on the statistical properties of the data, the distribution the data follows, and the nature of the relationships among its variables.

What Are the Key Properties of a Dataset?

Prior to performing any type of statistical analysis, understanding the nature of the data being analyzed is essential. You can use EDA to identify the properties of a dataset to determine the most appropriate statistical methods to apply to the data. You can investigate several types of properties with EDA techniques, including the following:

  • The center of the data
  • The spread among the members of the data
  • The skewness of the data
  • The probability distribution the data follows
  • The correlation among the elements in the dataset
  • Whether or not the parameters of the data are constant over time
  • The presence of outliers in the data
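
As a rough illustration of how a few of these properties can be checked, here is a minimal Python sketch that uses only the standard library. It is not taken from the book: the function names summarize and pearson, the sample values, and the choice of the 1.5 × IQR rule for flagging outliers are all assumptions made for this example. It covers center, spread, skewness, correlation, and outliers; it does not test for a particular probability distribution or for parameters changing over time.

```python
import math
import statistics


def summarize(values):
    """Return center, spread, skewness, and IQR-rule outliers (hypothetical helper)."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    stdev = statistics.pstdev(values)  # population standard deviation

    # Moment-based skewness: the average cubed z-score.
    # Roughly 0 for symmetric data, positive for a long right tail.
    skewness = sum(((x - mean) / stdev) ** 3 for x in values) / n

    # Quartiles for the 1.5 * IQR outlier rule (an assumed convention).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in values if x < low or x > high]

    return {
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "skewness": skewness,
        "outliers": outliers,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up sample with one unusually large value.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 30]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")

# Two made-up variables with a perfect linear relationship.
r = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"correlation: {r:.3f}")
```

In this sample the single large value pulls the mean above the median, produces positive skewness, and is the only point flagged as an outlier, which is the kind of pattern EDA is meant to surface before you choose a statistical method.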

Chapter 5 introduces most of these notions. Chapter 16 talks ...
