Python for Unix and Linux System Administration

Cover of Python for Unix and Linux System Administration by Noah Gift... Published by O'Reilly Media, Inc.

Chapter 12. Data Persistence

Data persistence, in a simple, generic sense, is saving data for later use. This implies that the saved data will survive even if the process that saved it terminates. This is typically accomplished by converting the data to some format and then writing it to disk. Sometimes the format is human readable, such as XML or YAML. Other times the format is not directly usable by humans, such as a Berkeley DB (bdb) file or a SQLite database.

What kind of data might you need to save for later? Perhaps you have a script that keeps track of the last modified date of the files in a directory, and you run it occasionally to see which files have changed since the previous run. The data about the files is something you want to save for later, where "later" is the next time you run the script. You could store this data in some kind of persistent data file. In another scenario, you have one machine with potential network issues, and you decide to run a script every 15 minutes to see how quickly it pings a number of other machines on the network. You could store the ping times in a persistent data file for later use. In this case, "later" has more to do with when you plan to examine the data than with when the program that gathered it needs access to it.
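The first scenario can be sketched in a few lines. This is only an illustration, not code from the book: the function names (`snapshot`, `load_state`, `save_state`, `changed_since_last_run`) and the state file path are hypothetical, and the standard library's `pickle` module is just one possible choice of on-disk format.

```python
import os
import pickle


def snapshot(directory):
    """Map each regular file name in `directory` to its last-modified time."""
    return {
        entry.name: entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    }


def load_state(state_path):
    """Return the snapshot saved by the previous run, or {} on the first run."""
    try:
        with open(state_path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}


def save_state(state_path, state):
    """Write the snapshot to disk so that it outlives this process."""
    with open(state_path, "wb") as f:
        pickle.dump(state, f)


def changed_since_last_run(directory, state_path):
    """List files that are new or modified since the last call, then save."""
    previous = load_state(state_path)
    current = snapshot(directory)
    changed = sorted(
        name for name, mtime in current.items() if previous.get(name) != mtime
    )
    save_state(state_path, current)
    return changed
```

Calling `changed_since_last_run(directory, state_path)` on each run reports what changed since the previous run, because the snapshot written by one process is read back by the next. The state file should live outside the watched directory; otherwise it would show up as a changed file on every run.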

We will break this discussion of serialization into two categories: simple and relational.

Simple Serialization

There are a number of ways of storing data to disk for later ...
