Accessing data is a necessary first step for using most of the tools in this book. I focus here on data input and output using pandas, though numerous tools in other libraries also help with reading and writing data in various formats.
Input and output typically fall into a few main categories: reading text files and other more efficient on-disk formats, loading data from databases, and interacting with network sources like web APIs.
pandas features a number of functions for reading tabular data as a DataFrame object. Table 6-1 summarizes some of them, though read_csv and read_table are likely the ones you’ll use the most.
|Function|Description|
|---|---|
|read_csv|Load delimited data from a file, URL, or file-like object; use comma as default delimiter|
|read_table|Load delimited data from a file, URL, or file-like object; use tab as default delimiter|
|read_fwf|Read data in fixed-width column format (i.e., no delimiters)|
| |Version of |
|read_excel|Read tabular data from an Excel XLS or XLSX file|
|read_hdf|Read HDF5 files written by pandas|
|read_html|Read all tables found in the given HTML document|
|read_msgpack|Read pandas data encoded using the MessagePack binary format|
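To make the difference between the delimited and fixed-width readers concrete, here is a minimal sketch that parses the same small table three ways. The sample strings are made up for illustration and are wrapped in io.StringIO, so the readers treat them as file-like objects and no files on disk are needed.

```python
import io

import pandas as pd

# Made-up sample data (an assumption for illustration), kept in memory
# so the example does not depend on any files existing on disk.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"
tsv_text = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"
fwf_text = "a  b  c\n1  2  3\n4  5  6\n"

# read_csv splits fields on commas by default.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table splits fields on tabs by default.
df_tab = pd.read_table(io.StringIO(tsv_text))

# read_fwf has no delimiter; by default it infers column boundaries
# from the whitespace-aligned columns.
df_fwf = pd.read_fwf(io.StringIO(fwf_text))

print(df_csv)
print(df_tab)
print(df_fwf)
```

All three calls should yield a DataFrame with columns a, b, and c and two rows of integers. In practice you would usually pass a file path or URL instead of a StringIO object.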