22.8. Programming with XML Documents

Right at the beginning of this chapter I introduced the notion of an XML processor as a module that is used by an application to read XML documents. An XML processor parses the contents of a document and makes the elements, together with their attributes and content, available to the application, so it is also referred to as an XML parser. In case you haven't met the term before, a parser is just a program module that breaks down text in a given language into its component parts. A natural language processor would have a parser that identifies the grammatical segments in each sentence. A compiler has a parser that identifies variables, constants, operators, and so on in a program statement. An application accesses the content of a document through an API provided by an XML parser and the parser does the job of figuring out what the document consists of.

Java supports two complementary APIs for processing an XML document:

  • SAX, which is the Simple API for XML

  • DOM, which is the Document Object Model for XML
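To give a feel for how the two APIs differ, here is a minimal sketch that reads the same small document both ways using the standard JAXP classes. The class name XmlApiSketch, the two method names, and the sample circle document are assumptions made up for illustration. With SAX the parser calls a handler method as it meets each element, whereas with DOM the parser builds the complete document tree in memory before the application looks at it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class XmlApiSketch {
    // Hypothetical sample document, used only for this illustration.
    public static final String XML =
        "<circle radius=\"15\"><position x=\"30\" y=\"50\"/></circle>";

    // SAX: the parser reports each element to a handler as it is encountered.
    public static List<String> elementNamesWithSax(String xml) throws Exception {
        final List<String> names = new ArrayList<String>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName);   // record the element name in document order
            }
        });
        return names;
    }

    // DOM: the parser builds the whole tree, which can then be navigated freely.
    public static String rootAttributeWithDom(String xml, String attrName)
            throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return root.getAttribute(attrName);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNamesWithSax(XML));           // element names seen by SAX
        System.out.println(rootAttributeWithDom(XML, "radius")); // attribute read via DOM
    }
}
```

In both cases the application never touches the raw text of the document. It only sees elements and attributes through the API, which is the division of labor between application and parser described above.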

The support in JDK 5.0 is for DOM level 3 and for SAX version 2.0.2. JDK 5.0 also supports XSLT version 1.0, where XSL is the Extensible Stylesheet Language and T is Transformations—a language for transforming one XML document into another, or into some other textual representation such as HTML. However, I'll concentrate on the basic application of DOM and SAX. XSLT is such an extensive topic that there are several books devoted entirely ...

