XML Processing with Python

Book Description

  • Breakthrough techniques for building XML applications — fast!

  • Includes a detailed Python tutorial

  • Learn about DOM and SAX application development with Python

  • Exclusive coverage of the new Pyxie XML processing library

  • CD-ROM includes Python and Pyxie distributions for Windows NT and Linux—plus powerful utilities and lots of working code

  • "XML processing is the newest required skill for webmasters and application developers. The Python language and Sean McGrath's book make it fun to learn and easy to do."
    — Charles F. Goldfarb

    When it comes to XML processing, Python is in a league of its own.

    If you're doing XML development without Python, you're wasting time! Python offers outstanding productivity — especially in the areas that matter most to XML developers, such as XML parsing, DOM/SAX implementations, string processing, and Internet APIs.

    And now there's Pyxie — the new open source library that makes Python XML processing even easier and more powerful. In XML Processing with Python, top XML developer Sean McGrath delivers the hands-on explanations and examples you need to get results with Python and Pyxie fast — even if you've never used them before!

  • Install Python and the Pyxie XML package

  • Learn the fundamentals of Python: control structures, classes, nested lists, dictionaries, and regular expressions

  • Process XML with regular expression-driven, event-driven, and tree-driven techniques

  • Understand Python's support for DOM and SAX APIs

  • Explore the power of Python/XML through worked examples of GUI development, database integration, and an XML query-by-example implementation

    Elegant, easy, powerful and fun, Python helps you build world-class XML applications in less time than you ever imagined. If you know XML, one book has all the techniques, code, and tools you'll need to process it: XML Processing with Python.


    The accompanying CD-ROM contains everything you need to develop XML applications with Python — including

  • complete Python distributions for Windows and Linux

  • the Pyxie open-source libraries

  • powerful utility programs

  • an extensive library of sample source code tested on both Windows NT and Linux

Table of Contents

    1. Copyright
    2. Charles F. Goldfarb Series on Open Information Management
    3. The Charles F. Goldfarb Series on Open Information Management
    4. Foreword
    5. Introduction
      1. Purpose of This Book
      2. The Pyxie Open Source Project
      3. Prerequisites
      4. How to Read This Book
      5. A Note about Platforms
      6. Structure of Code Samples
      7. And Finally . . .
    6. Installing Python
      1. Getting a Python Distribution
      2. Installing the Software
      3. Testing the Python Installation
      4. Using a Python Program File
      5. In Conclusion
    7. Installing the XML Package
      1. Testing the XML Package Installation
      2. Testing the pyExpat Module
      3. Testing SAX Support
    8. Tools of the Trade
      1. The XMLN and XMLV Parsing Utilities
      2. Simple XML-Processing Tasks with XMLN and XMLV
      3. The GetURL Utility—A Web Resource Retriever in Python
      4. The PYX2XML Utility: Converting PYX to XML
      5. The C3 Utility: An XML Document Editor/Viewer in Python
      6. In Conclusion
    9. Just Enough Python
      1. Introduction
      2. Basic Control Structures
      3. Functions
      4. Modules
      5. Data Structures
      6. Object Orientation
      7. Design Principles
      8. In Conclusion
    10. Some Important Details
      1. Dealing with Long Lines
      2. Using the dir Function
      3. Working with Docstrings
      4. Importing Modules
      5. Executing Python Programs
      6. Using the Special Object None
      7. Memory Management
      8. Copying Objects
      9. Determining Object Identity
      10. Handling Errors
      11. The Dynamic Nature of Python
      12. Named Parameters
      13. The Pass Statement
      14. Packages
    11. Processing XML with Regular Expressions
      1. Command-Line Arguments
      2. A Module Test Harness for xgrep
      3. What If There Are No Command-Line Parameters?
      4. Adding Support for Wildcards
      5. Parsing Command-Line Options
      6. A Pattern-Matching Dry Run
      7. Introducing Regular Expressions
      8. Using Escape Sequences in Regular Expressions
      9. Compiling Regular Expressions
      10. Adding Regular Expressions to xgrep
      11. xgrep in Action
      12. Parsing XML with Regular Expressions
      13. Cautionary Tales
      14. Avoiding False Positive Matches
      15. Shallow Parsing XML with Python Regular Expressions
      16. Current Implementation of xgrep
    12. Event-driven XML Processing
      1. Making xgrep XML-Aware
      2. Invoking xmln from xgrep
      3. Adding PYX Support for xgrep
      4. Adding XML Search Features to xgrep
      5. Using Long Option Names in getopt
      6. Using “Bit Twiddling” to Handle the Many Options Available
      7. The Match Printing Function
      8. Some Examples
      9. Generalizing the Idea of Event-Based XML Processing
      10. A Standardized Event-Driven Processing Model
      11. Advantages and Disadvantages of Event-Driven Processing
      12. In Conclusion
    13. Tree-driven XML Processing
      1. Modelling a Node
      2. Navigating a Tree
      3. Building xTree Structures
      4. Building an xTree By Using PYX
      5. A Test Harness for Pyxie
      6. Handling Line Ends
      7. A Syntax for Tree Processing with xgrep
      8. Adding Support for Attributes
      9. Some Utility Bits and Pieces
      10. Implementing XMLGrepTree
      11. A Standardized Tree-Driven XML Processing Model
      12. Advantages and Disadvantages of Tree-Driven XML Processing
      13. Some Examples
      14. Bringing It All Together
    14. Just Enough SAX
      1. History
      2. The Concept of an “Interface”
      3. Overview of the SAX Specification
      4. The HandlerBase Class
      5. The DocumentHandler Interface
      6. The AttributeList Interface
      7. The ErrorHandler Interface
      8. A SAX Inspection Application
      9. SAX as a Source of PYX
      10. Switching SAX Parsers
    15. Just Enough DOM
      1. History
      2. DOM Support in Python
      3. The DOM Architecture
      4. Accessing an XML File with pyDOM
      5. Navigating a DOM Tree
      6. Walking a DOM Tree
      7. Accessing Attributes
      8. Manipulating Trees
      9. Accessing an HTML File with pyDOM
      10. Printing the Text of an HTML Document
      11. Changing Data Content in a DOM Tree
      12. Creating a Tree Programmatically
      13. Converting HTML to PYX by Using DOM
      14. Using PYX as a DOM Data Source
    16. Pyxie: An Open Source XML-Processing Library for Python
      1. What Is Pyxie?
      2. Design Goals
      3. PYX Notation Processing
      4. Event-driven Processing
      5. Tree-driven Processing
      6. Tree Navigation
      7. Tree Cut-and-Paste
      8. Node Lists
      9. Tree Walking
      10. Hybrid Event- or Tree-driven Processing
      11. The Invoice Printing Problem Solved Three Ways
      12. The Complete Source Code for the Pyxie Library
    17. xFS: Filesystem Information in XML
      1. A Simple XML DTD for Filesystem Information
      2. Some Python Features Used in the xFS Application
      3. Viewing xFS Data with the C3 XML Editor/Viewer
      4. Performing Filesystem Queries with xgrep
      5. Source Code for xFS
    18. xMail: E-mail as XML
      1. The rfc822 Module
      2. A Simple DTD for E-mail
      3. An Example of an E-mail Message in XML
      4. Processing a Eudora Mailbox
      5. Processing a Linux Mailbox
      6. Processing an E-mail Message by Using the rfc822 Module
      7. Sending E-mail by Using xMail
      8. Source Code for the SendxMail Application
      9. Source Code for the xMail Application
    19. xMySQL: Relational Database Harvesting with Python SAX
      1. Installing MySQL
      2. Testing the MySQL Installation
      3. Installing the Python Interface to MySQL
      4. Testing the Python Interface to MySQL
      5. Mapping Relational Data to XML
      6. The Python SAX Driver Interface
      7. Implementing a SAX Driver for MySQL
      8. A Template for SAX Drivers
      9. The MySQL SAX Driver
      10. Some Examples
    20. xTract: A Query-By-Example XML Retrieval System
      1. Expressing XML Queries
      2. The Utility in Action
      3. xTract Version 1 Source Code
      4. Handling Large XML Files with xTract
      5. Source Code for the xTract1 Utility
    21. The C3 XML Editor/Viewer
      1. Developing wxPython Applications
      2. A “Hello World” wxPython Application
      3. Converting an xTree to a wxTree, and Vice Versa
      4. Dynamic Module Loading
      5. The Complete Source Code for C3
    22. An Overview of Python for Java Programmers
      1. Comparing the Python and Java Programming Languages
    23. An Overview of Python for Perl Programmers
    24. About the CD-ROM
      1. Windows
      2. Linux
      3. Generic Unix
      4. Miscellaneous
      5. Technical Support