4.1. Writing Standards-Compliant Web Pages

Problem

You need to create standards-compliant pages for your web site.

Solution

Add a DOCTYPE declaration to the first line of your HTML code, above the <html> tag:

	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
	 "http://www.w3.org/TR/html4/strict.dtd">

	<html>
	 <!-- Other HTML content -->
	</html>

Then validate your code, using tools available online or built into your HTML editor, to check its conformity to the W3C specification.
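Before you send a batch of pages to a validator, a quick script can flag the ones that lack a DOCTYPE altogether. The following Python sketch is one way to do that. The function name starts_with_doctype and the regular expression are illustrative assumptions, and the check only confirms that a declaration is present, not that the page is valid:

```python
import re

# Matches a DOCTYPE declaration at the very start of the source,
# allowing only whitespace before it. This is a presence check,
# not a validator.
DOCTYPE_RE = re.compile(r"\s*<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def starts_with_doctype(source: str) -> bool:
    """Return True if the page source begins with a DOCTYPE declaration."""
    return DOCTYPE_RE.match(source) is not None


page = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">

<html>
 <!-- Other HTML content -->
</html>'''

print(starts_with_doctype(page))             # page from the Solution above
print(starts_with_doctype("<html></html>"))  # no declaration at all
```

Pages that pass this check still need to go through a real validator, which compares the markup against the rules of the declared DTD.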

Discussion

Document Type Definitions (DTDs) for web pages are formal definitions, published by the World Wide Web Consortium (W3C), of the valid structure for a web page. Modern browsers use the DTD named in a page's DOCTYPE declaration to determine how the page should be rendered. The vocal supporters of best practices in web design have elevated these W3C recommendations to standards in a virtuous effort to advance web design from the late-1990s age of proprietary tags and inconsistent page rendering among various browsers to an era when web pages look more or less the same in every browser on every platform. Universal compliance with web standards requires that web pages declare a DOCTYPE and follow the DTD's HTML markup rules to the letter to achieve the noble goal of uniform browser rendering. Web pages without DOCTYPEs (and there are millions of them, with more going online every day) are just prolonging the days of the browser wars.

Hypertext Markup Language (HTML) was created in the early 1990s as an application of the older Standard Generalized Markup Language (SGML), which has been used since the mid-1980s to standardize the exchange, management, and publishing of all types of electronic documents, not just web pages. Since then, the HTML specification has been revised and expanded numerous times: to HTML 2.0 in late 1995, HTML 3.2 in early 1997, HTML 4.0 in late 1997, HTML 4.01 in 1999, XHTML 1.0 in 2000, and XHTML 1.1 in 2001. Each revision to the specification came with a new DTD that added to or amended those that came before it, but did not end the use of older DTDs. The most common DTDs used for new web sites these days are HTML 4.01 and XHTML 1.0, although many web sites still use HTML 3.2 as their DTD, while others don't use one at all. Table 4-1 shows a list of common DTDs and the declaration code to be included on the first line of a web page's source code.

Table 4-1. Common DTDs used in new web pages include HTML 4.01 and XHTML 1.0

Name
	Web page code

HTML 3.2
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

HTML 4.01 Strict
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

HTML 4.01 Transitional
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

HTML 4.01 Frameset
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">

XHTML 1.0 Strict
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

XHTML 1.0 Transitional
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

XHTML 1.0 Frameset
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

XHTML 1.1
	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
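If you inherit a site with a mix of old and new pages, it can help to find out which DTD each page declares. The Python sketch below looks up the public identifier (the first quoted string in the declaration) in a dictionary built from Table 4-1. The names KNOWN_DTDS and identify_dtd are hypothetical, and the sketch only recognizes the declarations listed in the table:

```python
import re
from typing import Optional

# Public identifiers from Table 4-1, mapped to short DTD names.
KNOWN_DTDS = {
    "-//W3C//DTD HTML 3.2 Final//EN": "HTML 3.2",
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "HTML 4.01 Frameset",
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
    "-//W3C//DTD XHTML 1.1//EN": "XHTML 1.1",
}

# Captures the public identifier of a <!DOCTYPE ... PUBLIC "..."> declaration.
PUBLIC_ID_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def identify_dtd(source: str) -> Optional[str]:
    """Return the name of the declared DTD, or None if it is missing or unknown."""
    match = PUBLIC_ID_RE.search(source)
    if match is None:
        return None
    return KNOWN_DTDS.get(match.group(1))


strict_page = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    ' "http://www.w3.org/TR/html4/strict.dtd">\n<html></html>'
)
print(identify_dtd(strict_page))
print(identify_dtd("<html></html>"))
```

A result of None means the page either has no DOCTYPE or declares one that is not in the table, so those pages deserve a closer look.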

Among the many changes from HTML Version 3.2 to HTML Version 4.0, the most notable shift came with the W3C's official phase out, or deprecation, of presentation-oriented tags—such as the once ubiquitous <font> tag—in favor of presentation rules defined using Cascading Style Sheets (CSS).

Tip

Web designers suffering from <font> tag withdrawal should note that they can use <span> with a style attribute to do almost anything they used to do with <font>. For example, instead of <font color="red"> use <span style="color:red">.

The XHTML specifications further build on HTML 4.0, and also move HTML toward a future marriage with the eXtensible Markup Language (XML), with which web builders can extend the functionality and interactivity of web pages.

A web page with a proper DOCTYPE declaration can be validated using online tools from the W3C (see the "See Also" section in this Recipe). Some web page editors have built-in validators that will check your source code against the W3C's rules.

Tip

A common validation error occurs when a URL in a web page's code contains unencoded ampersands (or other reserved characters defined by the W3C) in a query string, like this:

	<a href="search.php?f=t&arg=doug&p=1&c=0&sr=10&tf=75">More search results</a>

The ampersand characters should be replaced with the HTML entity &amp;, like this:

	<a href="search.php?f=t&amp;arg=doug&amp;p=1&amp;c=0&amp;sr=10&amp;tf=75">More search results</a>

The browser decodes the entities when it parses the page and requests the URL with plain ampersands, so the script or CGI receives the correct query string for processing.
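If your pages are generated by a script, you can encode the ampersands at the moment the link is written out rather than fixing them by hand. Here is a minimal sketch in Python using the standard library's html module. The helper name link_markup is an assumption made for illustration:

```python
from html import escape, unescape


def link_markup(url: str, text: str) -> str:
    """Build an <a> tag with the URL and link text encoded for use in HTML."""
    # escape() replaces &, <, and > (and quotes, with quote=True)
    # with their character entities.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))


raw_url = "search.php?f=t&arg=doug&p=1&c=0&sr=10&tf=75"
print(link_markup(raw_url, "More search results"))

# Decoding the encoded attribute value gives back the original URL,
# which is what an HTML parser does when it reads the href.
print(unescape(escape(raw_url, quote=True)) == raw_url)
```

The first print statement produces the same encoded link shown above, and the second confirms that decoding the entities restores the original query string.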

If you're new to DTDs or have a lot of legacy pages to check, prepare for lengthy validation reports that point out line by line where your web page code falls short of web standards. Try not to take it personally. (Instead, try a transitional DOCTYPE with a more lenient standard that's more likely to validate.) If you're new to web design, using a validator can be a great way to learn the latest HTML specification and make your pages perfect from the start.

For the most part, a web page with no DOCTYPE declaration (or a malformed one), or with tags that violate the declared DTD, will not fail to load or be rendered as unintelligible gibberish. By and large, newer browsers can handle a page with a few invalid tags or DTD violations. They will, however, render a page that lacks a proper DOCTYPE in what's called "quirks" mode, through a process called a DOCTYPE switch.

Tip

Microsoft's Internet Explorer 5.0 for Mac was the first to do the DOCTYPE switch. Now, all major browsers—Mozilla, Internet Explorer 6.0 for Windows, and Opera 8, among others—ship with a split personality: "standards" mode when the web page correctly declares its DOCTYPE (and follows its rules) and "quirks" mode when it does not.

Basically, quirks mode means the browser reverts to its own notion of how to render HTML code, standards be damned. Each browser has its own rules for entering quirks mode, as well as its own quirks. For more about Internet Explorer, Mozilla, and Opera quirks, see the links in the "See Also" section of this Recipe.
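To make the DOCTYPE switch more concrete, the following Python sketch applies a deliberately simplified set of rules that several browsers share: no DOCTYPE means quirks mode, an HTML 3.2 DOCTYPE means quirks mode, and an HTML 4.01 Transitional or Frameset DOCTYPE without the DTD's URL means quirks mode. Everything else is treated as standards mode. The function name rendering_mode is hypothetical, and the rules are an approximation rather than the exact behavior of any one browser:

```python
import re

# Captures the public identifier and the optional system identifier (URL)
# of a DOCTYPE declaration at the start of the page source.
DOCTYPE_RE = re.compile(
    r'\s*<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?)?\s*>',
    re.IGNORECASE,
)

# Public identifier prefixes that (in this simplified model) need a URL
# to stay out of quirks mode.
URL_REQUIRED = (
    "-//W3C//DTD HTML 4.01 TRANSITIONAL//",
    "-//W3C//DTD HTML 4.01 FRAMESET//",
)


def rendering_mode(source: str) -> str:
    """Guess 'standards' or 'quirks' under simplified, assumed rules."""
    match = DOCTYPE_RE.match(source)
    if match is None:
        return "quirks"      # missing or malformed DOCTYPE
    public_id, system_id = match.group(1), match.group(2)
    if public_id is None:
        return "standards"   # a bare declaration with no public identifier
    pid = public_id.upper()
    if pid.startswith("-//W3C//DTD HTML 3.2"):
        return "quirks"      # older DTDs are rendered the old way
    if pid.startswith(URL_REQUIRED) and system_id is None:
        return "quirks"      # transitional or frameset DTD without its URL
    return "standards"


print(rendering_mode('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                     '"http://www.w3.org/TR/html4/strict.dtd"><html></html>'))
print(rendering_mode('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 '
                     'Transitional//EN"><html></html>'))
print(rendering_mode("<html></html>"))
```

Treat the result as a rough guide only, and check each browser maker's documentation for its exact rules.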

Among the most notable examples of rendering inconsistencies in quirks mode are Internet Explorer's non-standard implementation of padding and margins—the so-called "box model" problem—and Mozilla's imperfect rendering of inline images in table cells.

Standards mode is the ideal, but quirks mode is reality. Add to that the limited time and budget of most web design projects, as well as the particular needs and browsing requirements of a web site's audience, and you'll get a pretty good idea why standards-based design is not yet part of that reality. Even some of the most well-known sites on the Web do not conform to standards. A random sample of ten of the Web's most popular sites, from Amazon to The Weather Channel, found that four don't even declare a DOCTYPE on their home page! A majority of the rest used a transitional DTD, which in some cases can trigger quirks mode as well.

Should you declare a DOCTYPE on your web pages? Yes. Declaring a DTD is the first step in creating a standards-compliant web site. But ultimately, the proof is in the pudding. Validate your web page markup, but don't consider validation an end in itself. Test your pages in common browsers and fix real rendering problems that detract from your audience's ability to use your site.

See Also

For a concrete example of how a web page's DOCTYPE can affect the way it gets rendered, see Recipe 5.3.

The major browser makers all have pages describing their product's handling of the DOCTYPE switch and rendering idiosyncrasies in quirks mode: Mozilla (http://www.mozilla.org/docs/web-developer/quirks/), Internet Explorer (http://msdn.microsoft.com/workshop/author/css/overview/cssenhancements.asp and http://msdn.microsoft.com/workshop/author/dhtml/reference/properties/compatmode.asp), and Opera (http://www.opera.com/docs/specs/doctype/).

The World Wide Web Consortium's list of valid DTDs is at http://www.w3.org/QA/2002/04/valid-dtd-list.html, and the W3C's HTML validator is at http://validator.w3.org/.
