Pretty-Print XML Using a Generic Identity Stylesheet and Xalan

Sometimes your XML output from various programs is less than attractive. Spruce it up in a hurry with Xalan C++ and an identity transform.

In earlier hacks ([Hack #5] and [Hack #21]), you saw the unsightly XML output from Word 2003 and CSVToXML. This XML is unsightly because it is written on only one or two lines. If you want it to be human-readable, here is a quick hack that pretty-prints the XML by indenting it properly.

In the working directory for this book you will find identity.xsl (Example 3-13), a very simple identity stylesheet that copies every node from the source tree to the result tree and serializes the result as XML.

Example 3-13. identity.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="ISO-8859-1"/>

<xsl:template match="@*|node()">
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The template matches every attribute (@*) and every child node (node()), that is, elements, text, comments, and processing instructions. It copies each matched node with the copy instruction and then applies templates to that node's own attributes and children, recursing until all nodes are processed. The indent attribute on the output element tells the processor to indent the output; the number of spaces is processor-dependent. If you apply this stylesheet using Xalan C++ on the command line, you can use the -i switch to specify the number of spaces to indent the output. If you use Saxon, you can use the Saxon extension saxon:indent-spaces ...
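To make the recursion in the identity template concrete, here is a minimal sketch of the same copy-then-indent idea in Python's standard library. It is an analogue, not XSLT and not what Xalan does internally: the function names identity_copy and pretty_print are hypothetical, it relies on xml.etree.ElementTree.indent (Python 3.9 or later), and unlike node() it handles only elements, attributes, and text, because the default ElementTree parser drops comments and processing instructions.

```python
import xml.etree.ElementTree as ET


def identity_copy(node):
    """Copy an element the way the identity template does: copy the node
    itself, then recurse into its attributes and children."""
    copy = ET.Element(node.tag, dict(node.attrib))  # plays the role of xsl:copy plus @*
    copy.text = node.text
    copy.tail = node.tail
    for child in node:  # plays the role of xsl:apply-templates
        copy.append(identity_copy(child))
    return copy


def pretty_print(xml_text, spaces=2):
    """Parse, copy, and indent; 'spaces' is loosely analogous to an
    indent-amount option on a command-line processor."""
    root = identity_copy(ET.fromstring(xml_text))
    ET.indent(root, space=" " * spaces)  # requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '<doc><item id="1">one</item><item id="2">two</item></doc>'
    print(pretty_print(sample))
    # <doc>
    #   <item id="1">one</item>
    #   <item id="2">two</item>
    # </doc>
```

The copy step is redundant for pretty-printing alone, since indenting the parsed tree directly would give the same text. It is kept here only to mirror the structure of the stylesheet, where copying and indenting are separate concerns.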
