Pyxie

The Pyxie package, developed by Sean McGrath, is available from http://pyxie.sourceforge.net/ and is built around a line-oriented notation known as PYX. PYX and Pyxie are an alternative to SAX and DOM and are, according to their author, geared toward pipeline processing, in which one application’s output is fed as input to the next application. This idiom is common among Unix tools; it is also used on Windows, though it is less common there for end-user tools.

Pyxie can parse an XML document into a line-oriented format known as PYX, in which each line signals one piece of the document’s content. It’s similar to SAX in that it is event-driven; however, instead of being delivered to callback interfaces that you implement, the events are written to standard output as PYX notation. The PYX output can then be processed by other text manipulation tools such as grep, sed, and awk, or fed into other text-aware scripts you might write in Python or Perl.

PYX output appears as individual lines representing different types of markup. Consider the following XML:

<Book>
  <Name>Python and XML</Name>
  <Publisher>O'Reilly &amp; Associates</Publisher>
</Book>

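Pyxie or another PYX-aware processor would convert the above XML to the following PYX, in which a leading ( opens an element, a leading ) closes one, and a leading - carries character data (a line break within the data is written as \n):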

(Book
-\n
(Name
-Python and XML
)Name
-\n
(Publisher
-O'Reilly & Associates
)Publisher
-\n
)Book
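
To make the mapping from XML events to PYX lines concrete, here is a minimal sketch that produces PYX-style lines with Python's standard xml.sax module. It is not Pyxie's own implementation or API: the names PyxWriter, escape, and xml_to_pyx are hypothetical, and the A (attribute) and ? (processing instruction) line types are included on the assumption that the input may contain constructs the example above does not.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def escape(text):
    """Write backslash, newline, and tab as two-character escapes."""
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


class PyxWriter(ContentHandler):
    """Hypothetical handler that records SAX events as PYX-style lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._text = []  # SAX may deliver one text run in several chunks

    def _flush(self):
        # Emit buffered character data as a single "-" line.
        if self._text:
            self.lines.append("-" + escape("".join(self._text)))
            self._text = []

    def startElement(self, name, attrs):
        self._flush()
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append("A" + key + " " + escape(attrs.getValue(key)))

    def endElement(self, name):
        self._flush()
        self.lines.append(")" + name)

    def characters(self, content):
        self._text.append(content)

    def processingInstruction(self, target, data):
        self._flush()
        self.lines.append("?" + target + " " + data)


def xml_to_pyx(xml_text):
    """Return the PYX-style lines for an XML string."""
    handler = PyxWriter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines
```

Calling print("\n".join(xml_to_pyx(document))) writes one construct per line. Note that this sketch keeps whitespace-only text exactly as the parser reports it, so any indentation spaces that follow a line break appear after the \n on the same - line.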

Note that each document construct in the PYX output is given its own line. This makes PYX well suited to Unix-style command-line processing ...
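
Because every construct sits on its own line, a consumer only has to look at the first character of each line. The following is a rough sketch of such a consumer in Python, using only the standard library; the function names pyx_events, unescape, and element_text are hypothetical and are not part of Pyxie. It assumes the line types shown in the example above and does not handle an element nested inside another element of the same name.

```python
import re

# Map the character after a backslash back to the character it stands for.
_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def unescape(text):
    """Turn the two-character escapes in a PYX data line back into text."""
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), text)


def pyx_events(lines):
    """Yield (kind, payload) pairs, where kind is the line's first character."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line:
            yield line[0], line[1:]


def element_text(lines, name):
    """Collect the character data found inside each element called name."""
    inside = False
    parts = []
    found = []
    for kind, payload in pyx_events(lines):
        if kind == "(" and payload == name:
            inside, parts = True, []
        elif kind == ")" and payload == name and inside:
            found.append("".join(parts))
            inside = False
        elif kind == "-" and inside:
            parts.append(unescape(payload))
    return found
```

Passing sys.stdin as the lines argument lets a script built on element_text sit at the end of a pipeline, in the same position a grep or awk command would occupy.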
