19.0 Introduction

Extensible Markup Language (XML) underpins much of the software development done today. XML is a document markup standard defined by the World Wide Web Consortium (W3C). An XML element consists of a start tag and an end tag that enclose textual data or other elements. Elements can be nested inside others, but they have to be balanced: you can’t open an element in one parent element and close it in another. For example, the following XML is malformed because the child element’s closing tag is outside the parent element’s closing tag:

<parent>
    <child>Some text in the child
</parent>
</child>

This snippet, however, is properly balanced, since child opens and closes inside parent:

<parent>
    <child>Some text in the child</child>
</parent>
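
The difference between the two snippets above can be checked mechanically, because an XML parser rejects unbalanced markup. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for illustration and does not come from any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses without error (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for mismatched or unclosed tags, among other syntax errors.
        return False


# The child element is closed after its parent: not balanced.
unbalanced = "<parent>\n    <child>Some text in the child\n</parent>\n</child>"

# The child element opens and closes inside parent: balanced.
balanced = "<parent>\n    <child>Some text in the child</child>\n</parent>"

print(is_well_formed(unbalanced))  # False
print(is_well_formed(balanced))    # True
```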

XML is free-form, in that there’s no requirement for a particular set of tags to be used. You can use any set of tags and create any structure that makes sense for your application. Your XML document is said to be well formed if it’s balanced and has exactly one root element.
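
The single-root rule can be illustrated the same way. This sketch, again assuming Python's standard xml.etree.ElementTree module, parses a document built from made-up tag names and then a fragment with two top-level elements. The tag names recipe and step and the helper root_tag are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET


def root_tag(xml_text):
    """Return the root element's tag name, or None if parsing fails."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError:
        return None


# Any tag names are acceptable as long as the document has one root.
one_root = "<recipe><step>Mix</step><step>Bake</step></recipe>"

# Two top-level elements: balanced, but there is no single root element.
two_roots = "<step>Mix</step><step>Bake</step>"

print(root_tag(one_root))   # recipe
print(root_tag(two_roots))  # None
```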

While an XML document’s structure can be free-form, you also have the option to define a specific structure for the XML using a Document Type Definition (DTD). An XML document that meets the structure rules laid out in a DTD is said to be valid. XML Schema (XSD), another way to define XML structure, is increasingly favored over DTDs.
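
To make the distinction between well formed and valid concrete, the sketch below embeds a small DTD in a document that deliberately breaks its rules. It assumes Python's standard xml.etree.ElementTree module, whose underlying parser checks well-formedness only and does not validate against a DTD. The document therefore parses even though it is not valid; catching the violation would require a validating parser, which this sketch does not use. The other element is hypothetical.

```python
import xml.etree.ElementTree as ET

# Internal DTD: <parent> must contain exactly one <child> element.
# The document body breaks that rule by using <other> instead.
DOC = """<!DOCTYPE parent [
  <!ELEMENT parent (child)>
  <!ELEMENT child (#PCDATA)>
]>
<parent><other>Some text</other></parent>"""

# Parsing succeeds because the markup is balanced and has one root.
# No validation against the DTD takes place here.
root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]

print(root.tag)    # parent
print(child_tags)  # ['other']
```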

A parser reads the XML document, pulls in any external files, verifies that it’s well formed, validates it against any applicable DTDs or schemas, and passes ...
