Chapter 2. Introductory Examples

This book teaches you the Python tools to work productively with data. Readers will have many different end goals for their work, but the tasks required generally fall into a few broad groups:

Interacting with the outside world

Reading and writing with a variety of file formats and databases.

Preparation

Cleaning, munging, combining, normalizing, reshaping, slicing and dicing, and transforming data for analysis.

Transformation

Applying mathematical and statistical operations to groups of data sets to derive new data sets, for example, aggregating a large table by group variables.

Modeling and computation

Connecting your data to statistical models, machine learning algorithms, or other computational tools.

Presentation

Creating interactive or static graphical visualizations or textual summaries.
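To make these groups concrete, here is a minimal sketch that walks a tiny table through four of them: reading, preparation, transformation, and presentation. The modeling step is left out. It deliberately uses only the Python standard library so that it runs anywhere; it is not the approach used in the rest of the book. The data, column names, and variable names are all invented for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw data, invented for this illustration only.
RAW = """city,temp_f
Boston, 61
boston,59
Austin,88
Austin,
austin ,92
"""

# Interacting with the outside world: parse CSV text into rows.
rows = list(csv.DictReader(io.StringIO(RAW)))

# Preparation: normalize city names and drop rows with a missing reading.
clean = [
    {"city": r["city"].strip().title(), "temp_f": float(r["temp_f"])}
    for r in rows
    if r["temp_f"].strip()
]

# Transformation: aggregate the table by a group variable (city).
groups = defaultdict(list)
for r in clean:
    groups[r["city"]].append(r["temp_f"])
mean_temp = {city: mean(temps) for city, temps in groups.items()}

# Presentation: a short textual summary, one line per city.
for city, avg in sorted(mean_temp.items()):
    print(f"{city}: {avg:.1f} F over {len(groups[city])} readings")
```

Each commented step corresponds to one of the groups above. In real work the same pipeline would usually be much shorter with a dedicated data analysis library.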

In this chapter I will show you a few data sets and some things we can do with them. These examples are just intended to pique your interest and thus will only be explained at a high level. Don’t worry if you have no experience with any of these tools; they will be discussed in great detail throughout the rest of the book. In the code examples you’ll see input and output prompts like In [15]:; these are from the IPython shell.

Note

To follow along with these examples, start IPython in Pylab mode by running ipython --pylab at the command prompt.

1.usa.gov data from bit.ly

In 2011, URL shortening service bit.ly partnered with the United States government ...
