Chapter 5

Exploring Data Analysis

IN THIS CHAPTER

Understanding the exploratory data analysis (EDA) philosophy

Describing numeric and categorical distributions

Estimating correlation and association

Testing mean differences in groups

Visualizing distributions, relationships, and groups

“If you torture the data long enough, it will confess.”

— RONALD COASE

Data science relies on complex algorithms for building predictions and spotting important signals in data, and each algorithm has its own strengths and weaknesses. In short, you select a range of algorithms, run them on the data, tune their parameters as far as you can, and finally decide which one best helps you build your data product or generate insight into your problem.
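The select, run, tune, and decide loop described above can be sketched in a few lines of Python. This is only a minimal illustration and not the chapter's own code: the toy data, the helper names (make_data, mean_model, knn_model, mse), and the candidate values of k are all assumptions made up for the example, and it uses only the standard library.

```python
import random
import statistics

def make_data(n, seed):
    """Generate toy (x, y) pairs: y = 2x + 1 plus Gaussian noise (made-up data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in xs]

def mean_model(train):
    """Baseline candidate: always predict the mean of the training targets."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y

def knn_model(train, k):
    """k-nearest-neighbors regression: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict

def mse(model, data):
    """Mean squared error of a model's predictions on (x, y) pairs."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in data)

# Hypothetical training and validation sets.
train = make_data(200, seed=0)
valid = make_data(100, seed=1)

# Step 1: select a range of candidate algorithms (k is the tunable parameter).
candidates = {"mean baseline": mean_model(train)}
for k in (1, 5, 15):
    candidates[f"knn (k={k})"] = knn_model(train, k)

# Steps 2 and 3: run every candidate on held-out data and score it.
scores = {name: mse(model, valid) for name, model in candidates.items()}

# Step 4: decide which candidate did best (lowest validation error).
best_name = min(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: validation MSE = {score:.3f}")
print("Selected:", best_name)
```

Scoring on a separate validation set, rather than on the training data, is what keeps the comparison honest: a model with k=1 fits its own training points perfectly, so training error alone would always favor it.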

This process sounds somewhat automatic and, in part, it is, thanks to powerful analytical software and scripting languages like Python. Learning algorithms are complex, and their sophisticated procedures naturally seem automatic and ...
