Documents and DTDs

To be perfectly correct, we must explain that “XML” has come to mean many subtly different things. An “XML document” is a document containing content that conforms to a markup language defined according to the XML standard. An “XML Document Type Definition” (XML DTD) is a set of rules, more formally known as “entity and element declarations,” that define an XML markup language; i.e., how the tags are arranged in a correct (“valid”) XML document. To make things even more confusing, entity and element declarations may appear in an XML document itself, as well as within an XML DTD.

An XML document contains character data, which consists of plain content and markup in the form of tags and XML declarations. Thus:

<blah>harrumph</blah>

is a line in a well-formed XML document. Well-formed XML documents follow certain rules, such as the requirement that every opening tag have a matching closing tag. These rules are presented in the context of XHTML in Chapter 16.
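Well-formedness can be checked mechanically by any XML parser. The following is a minimal sketch in Python, which is not part of the original text; the helper name is_well_formed is hypothetical. It assumes only the standard-library xml.etree.ElementTree module, whose parser rejects input that is not well-formed but does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as well-formed XML, False otherwise.

    Hypothetical helper for illustration; it checks well-formedness only,
    not validity against a DTD.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for problems such as a missing or mismatched closing tag.
        return False


print(is_well_formed("<blah>harrumph</blah>"))  # closing tag present
print(is_well_formed("<blah>harrumph"))         # closing tag missing
```

The first call prints True and the second prints False, because the second string never closes its <blah> tag.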

To be considered valid (that is, to conform to a DTD), an XML document must have a corresponding set of XML declarations that define how the tags and content should be arranged within it. These declarations may be included directly in the XML document, or they may be stored separately in an XML DTD. If an XML DTD exists that defines the <blah> tag, our well-formed XML document is valid, provided you preface it with a <!DOCTYPE> tag that explains where to find the appropriate DTD:

<?xml version="1.0"?> <!DOCTYPE blah ...
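As noted earlier, the declarations may also be included directly in the XML document. The sketch below, in Python and not part of the original text, shows one hypothetical way this might look: a one-line element declaration for <blah> placed inside the <!DOCTYPE> tag itself. The document string and the helper doctype_matches_root are illustrative assumptions. Python's standard-library xml.dom.minidom reads the <!DOCTYPE> but does not validate, so the helper checks only one validity rule: the name in <!DOCTYPE> must match the root element.

```python
from xml.dom import minidom

# Hypothetical document: the DTD is embedded in the <!DOCTYPE> tag
# rather than stored in a separate file.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE blah [
  <!ELEMENT blah (#PCDATA)>
]>
<blah>harrumph</blah>"""


def doctype_matches_root(text: str) -> bool:
    """Return True if the document has a <!DOCTYPE> whose name equals
    the root element's tag name.

    This is a single validity rule, not full DTD validation, which
    minidom does not perform.
    """
    doc = minidom.parseString(text)
    if doc.doctype is None:
        # No <!DOCTYPE> at all, so the document cannot be valid.
        return False
    return doc.doctype.name == doc.documentElement.tagName


print(doctype_matches_root(DOCUMENT))
```

Full validation against the declarations requires a validating parser, which is beyond what this sketch attempts.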
