Chapter 4. Data Representation

PALO ALTO, Calif.—Intel says its Pentium Pro and new Pentium II chips have a flaw that can cause computers to sometimes make mistakes but said the problems could be fixed easily with rewritten software.

Reuters telegram

Overview

This chapter describes a number of modules that can be used to convert between Python objects and other data representations. These modules are often used to read and write foreign file formats and to store or transfer Python variables.

Binary Data

Python provides several support modules that help you decode and encode binary data formats. The struct module can convert between binary data structures (like C structs) and Python tuples. The array module wraps binary arrays of data (C arrays) into a Python sequence object.
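As a minimal sketch of the two modules just described, the following packs a few values into a C-style record and stores integers in a compact array. The format string, the values, and the variable names are illustrative assumptions, and the byte-string literals assume a Python 3 interpreter.

```python
import struct
from array import array

# "<" selects little-endian layout without padding;
# "i2sf" means: 32-bit int, 2-byte char array, 32-bit float.
record = struct.pack("<i2sf", 7, b"ab", 2.5)

# unpack always returns a tuple, one item per format code.
fields = struct.unpack("<i2sf", record)

# An array holds homogeneous C values but behaves like a Python sequence.
samples = array("i", [1, 2, 3])
samples.append(4)

# The underlying C buffer can be exported as raw bytes.
raw = samples.tobytes()
```

The struct module suits fixed-layout records, while array is the better fit for long runs of values of a single type.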

Self-Describing Formats

To pass data between different Python programs, you can marshal or pickle your data.

The marshal module uses a simple self-describing format that supports most built-in datatypes, including code objects. Python uses this format itself to store compiled code on disk (in PYC files).
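A short sketch of marshal in use, assuming Python 3: it round-trips a dictionary of built-in types and then a compiled code object. The sample data and the "<example>" filename are made up for illustration.

```python
import marshal

# Built-in datatypes survive a round trip through the marshal format.
data = {"name": "spam", "values": [1, 2.5, None], "flag": True}
blob = marshal.dumps(data)
restored = marshal.loads(blob)

# Code objects can be marshalled too, which is how compiled code is stored.
code = compile("answer = 6 * 7", "<example>", "exec")
code_blob = marshal.dumps(code)

# Load the code object back and run it in a fresh namespace.
namespace = {}
exec(marshal.loads(code_blob), namespace)
```

The marshal format is tied to the interpreter version, so it is best kept for data that the same Python installation writes and reads.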

The pickle module provides a more sophisticated format, which supports user-defined classes, self-referencing data structures, and more. This module is available in two versions; the basic pickle module is written in Python and is relatively slow, while cPickle is written in C and is usually as fast as marshal.
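The following sketch shows the two features mentioned above: a user-defined class and a self-referencing structure. The Node class is hypothetical and exists only for this example. The code assumes Python 3, where importing pickle picks up the C implementation automatically when it is available.

```python
import pickle


class Node:
    """Hypothetical linked-list node, used only for this example."""

    def __init__(self, label):
        self.label = label
        self.next = None


# Build a two-node cycle: a -> b -> a.
a = Node("a")
b = Node("b")
a.next = b
b.next = a

# pickle records object identity, so the cycle is preserved.
blob = pickle.dumps(a)
copy = pickle.loads(blob)
```

Unpickling can execute arbitrary code, so only load pickles that come from a source you trust.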

Output Formatting

The modules in this group supplement Python's built-in formatting facilities, such as the repr function and the % string formatting operator.

The pprint module can print almost any Python data structure in a nice, readable way (as readable as it can make things, that is).
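A brief sketch of pprint on a nested structure, assuming Python 3. The inventory dictionary is invented sample data, and the width of 40 columns is an arbitrary choice that forces the output to wrap.

```python
from pprint import pformat, pprint

# Invented sample data: too wide to fit on one 40-column line.
inventory = {
    "fruits": ["apple", "banana", "cherry", "damson"],
    "counts": {"apple": 3, "banana": 12, "cherry": 250, "damson": 0},
    "nested": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
}

# pprint writes the formatted structure to standard output.
pprint(inventory, width=40)

# pformat returns the same text as a string instead of printing it.
text = pformat(inventory, width=40)
```

For plain containers like this one, the formatted text is still valid Python source, so it can be pasted back into a program.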

The repr module provides a replacement for the built-in function with the same name. The version in this module applies tight limits to most things: by default, it prints no more than 30 characters from each string and no more than six levels of nested data structures, and it truncates long containers in a similar way.
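The size-limited repr can be sketched as follows. This example assumes Python 3, where the module described above is named reprlib. The test values are arbitrary.

```python
import reprlib

# A long string is cut down, with "..." marking the removed middle part.
long_string = "x" * 100
short = reprlib.repr(long_string)

# Deeply nested containers are cut off below the default depth limit.
nested = [1, [2, [3, [4, [5, [6, [7, [8]]]]]]]]
shallow = reprlib.repr(nested)

print(short)
print(shallow)
```

The limits are attributes of a reprlib.Repr instance, such as maxstring and maxlevel, so a program that needs different limits can create its own instance and adjust them.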

Encoded Binary Data

Python supports most common binary-to-text encodings, such as base64, binhex (a Macintosh format), quoted-printable, and uuencode.
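As a rough sketch, the following round-trips the same bytes through three of these encodings, assuming Python 3. The payload is arbitrary sample data. The uuencode step uses the low-level binascii functions, which handle one line of at most 45 bytes at a time. binhex is left out because recent Python releases no longer ship that module.

```python
import base64
import binascii
import quopri

# Arbitrary sample payload containing non-ASCII and control bytes.
payload = b"caf\xe9 \x00\x01\x02 binary data"

# base64: every 3 input bytes become 4 ASCII characters.
b64 = base64.b64encode(payload)

# quoted-printable: unsafe bytes become =XX escapes, the rest stays readable.
qp = quopri.encodestring(payload)

# uuencode: binascii encodes a single line (at most 45 input bytes).
uu_line = binascii.b2a_uu(payload)

print(b64)
print(qp)
print(uu_line)
```

Base64 is the usual choice for arbitrary binary data, while quoted-printable works best when most of the input is already plain text.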
