22.5. Rules for a Well-Formed Document

Now that you know a bit more about XML elements and what goes into a DTD, I can formulate what you must do to ensure your XML document is well-formed. The rules for a document to be well-formed are quite simple:

  1. If the XML declaration appears in the prolog, it must include the XML version. Any other specifications in the XML declaration must be in the prescribed sequence: character encoding followed by standalone specification.

  2. If the document type declaration appears in the prolog, the DOCTYPE name must match that of the root element, and the markup declarations in the DTD must follow the rules for writing markup declarations.

  3. The body of the document must contain at least one element, the root element, which contains all the other elements, and an instance of the root element must not appear in the content of another element. All elements must be properly nested.

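  4. Elements in the body of the document must be consistent with the markup declarations identified by the DOCTYPE declaration. Strictly speaking, the XML specification classes consistency with the DTD as a validity requirement, so only a validating parser checks it.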
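To see some of these rules in action, you can hand a few small documents to a parser and observe which ones it rejects. The following is only a minimal sketch, assuming the JAXP DOM parser that ships with the JDK. The class name WellFormedCheck, the method isWellFormed(), and the circle and position elements are hypothetical names chosen for illustration. The parser here is non-validating, so it checks the XML declaration, the single root element, and the nesting of elements, but it does not compare elements against a DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only
public class WellFormedCheck {

  // Returns true if a non-validating parser accepts the text as well-formed XML
  static boolean isWellFormed(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setValidating(false);          // Well-formedness only, no DTD validation
      DocumentBuilder builder = factory.newDocumentBuilder();
      // DefaultHandler rethrows fatal errors without printing them to System.err
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false;                          // The parser reported a fatal error
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String[] samples = {
      // Declaration with a version, one root element, proper nesting
      "<?xml version=\"1.0\"?><circle><position/></circle>",
      // XML declaration without the version (rule 1)
      "<?xml encoding=\"UTF-8\"?><circle/>",
      // standalone placed before encoding (rule 1)
      "<?xml version=\"1.0\" standalone=\"yes\" encoding=\"UTF-8\"?><circle/>",
      // Two top-level elements, so no single root (rule 3)
      "<circle/><circle/>",
      // Elements overlap instead of nesting (rule 3)
      "<circle><position></circle></position>"
    };
    for (String sample : samples) {
      System.out.println(isWellFormed(sample) + "  " + sample);
    }
  }
}
```

Only the first sample obeys all of the rules that a non-validating parser enforces; each of the others breaks the rule noted in its comment.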

The rules for writing an XML document are absolutely strict. Break one rule and your document is not well-formed and will not be processed. This strict application of the rules is essential because you are communicating data and its structure. If any laxity were permitted, it would open the door to uncertainty about how the data should be interpreted. HTML used to be quite different from XML in this respect. Until recently, the rules for writing HTML were only ...

