Moving from HTML to XHTML

Most of the changes required to turn an existing HTML document into an XHTML document involve making the document well-formed. For instance, given a legacy HTML document, you’ll probably have to make at least some of these changes to turn it into XHTML:

  • Add missing end-tags like </p> and </li>.

  • Rewrite elements so that they nest rather than overlap. For example, change <p><em>an emphasized paragraph</p></em> to <p><em>an emphasized paragraph</em></p>.

  • Put double or single quotes around attribute values. For example, change <p align=center> to <p align="center">.

  • Add values (which are the same as the name) to all minimized Boolean attributes. For example, change <input type="checkbox" checked> to <input type="checkbox" checked="checked">.

  • Replace any occurrences of & or < in character data or attribute values with &amp; and &lt;. For instance, change A & P to A &amp; P and <a href="http://www.google.com/search?client=googlet&q=Java%20XML"> to <a href="http://www.google.com/search?client=googlet&amp;q=Java%20XML">.

  • Make sure the document has a single root html element.

  • Change empty elements like <hr> to <hr /> or <hr></hr>.

  • Add hyphens to comments so that <! this is a comment> becomes <!-- this is a comment -->.

  • Encode the document in UTF-8 or UTF-16, or add an XML declaration that specifies its character encoding.
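
One quick way to see whether a converted document has reached well-formedness is to run it through any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module is available; the is_well_formed helper and the two sample strings are hypothetical, made up here to exercise several of the changes listed above. It checks well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if markup parses as XML, False otherwise.

    Hypothetical helper for illustration: it tests well-formedness
    only and does not validate against any XHTML DTD.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Legacy HTML with several of the problems from the list above:
# an unquoted attribute value, overlapping elements, a raw ampersand,
# an unclosed empty element, and a minimized Boolean attribute.
legacy = (
    '<html><body><p align=center><em>A & P</p></em><hr>'
    '<input type="checkbox" checked></body></html>'
)

# The same content after applying the corresponding fixes.
fixed = (
    '<html><body><p align="center"><em>A &amp; P</em></p><hr />'
    '<input type="checkbox" checked="checked" /></body></html>'
)

print(is_well_formed(legacy))  # False
print(is_well_formed(fixed))   # True
```

Because this parser does not load the XHTML DTD, a document that uses named entities such as &nbsp; would be reported as not well-formed by this sketch even if it is otherwise correct; numeric character references avoid that.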

XHTML doesn’t merely require well-formedness; it also requires validity. In order to create a valid XHTML document, you’ll need to make ...
