Character Entity References

Characters that have special meaning in markup, such as < and &, and characters not found in the normal alphanumeric character set, such as ©, must be specified in HTML and XHTML documents using character references. This is known as escaping the character. Using the standard desktop publishing keyboard commands (such as Option-G for the © symbol) within an HTML document will not produce the desired character when the document is rendered in a browser. In fact, the browser generally displays the numeric entity for the character.

In (X)HTML documents, escaped characters are indicated by character references that begin with & and end with ;. The character may be referred to by its Numeric Character Reference (NCR) or a predefined character entity name.
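As a rough illustration of escaping, the sketch below uses Python's standard html module to replace markup-significant characters with references that begin with & and end with ;. The helper name escape_markup is hypothetical and is not part of the original text; the sample string is invented for the example.

```python
import html


def escape_markup(text):
    """Escape &, <, and > so the text can sit safely inside (X)HTML content.

    escape_markup is a hypothetical helper name used only for this sketch.
    quote=False leaves quotation marks untouched.
    """
    return html.escape(text, quote=False)


sample = "if a < b && c > d"
print(escape_markup(sample))  # if a &lt; b &amp;&amp; c &gt; d

# Unescaping the result recovers the original text.
print(html.unescape(escape_markup(sample)) == sample)  # True
```

Note that the ampersand itself is escaped as &amp;, since an unescaped & would otherwise be read as the start of a character reference.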

A Numeric Character Reference refers to a character by its Unicode code point in either decimal or hexadecimal form (for more information on Unicode and code points, see Chapter 6). Decimal character references use the syntax &#nnnn;. Hexadecimal values are indicated by an “x”: &#xhhhh;. For example, the less-than (<) character could be identified as &#60; (decimal) or &#x3C; (hexadecimal).
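The two numeric forms can be derived directly from a character's code point. The following is a minimal sketch in Python; the function name numeric_references is hypothetical and chosen only for this illustration.

```python
def numeric_references(char):
    """Return the (decimal, hexadecimal) character references for one character.

    numeric_references is a hypothetical helper name used only for this sketch.
    """
    code_point = ord(char)  # Unicode code point as an integer
    decimal_ref = f"&#{code_point};"
    hex_ref = f"&#x{code_point:X};"  # the "x" marks the value as hexadecimal
    return decimal_ref, hex_ref


print(numeric_references("<"))  # ('&#60;', '&#x3C;')
print(numeric_references("©"))  # ('&#169;', '&#xA9;')
```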

Character entities are abbreviated names for characters, such as &lt; for the less-than symbol. Character entities are predefined in the DTDs of markup languages such as HTML and XHTML as a convenience to authors, because they may be easier to remember than Numeric Character References.
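To show that an entity name and a Numeric Character Reference are two spellings of the same character, the sketch below looks up names in Python's html.entities.name2codepoint table, which maps HTML 4 entity names to code points. The helper name entity_to_numeric is hypothetical and not drawn from the original text.

```python
import html
from html.entities import name2codepoint


def entity_to_numeric(name):
    """Map an entity name (written without & and ;) to its decimal reference.

    entity_to_numeric is a hypothetical helper name used only for this sketch.
    """
    return f"&#{name2codepoint[name]};"


print(entity_to_numeric("lt"))    # &#60;
print(entity_to_numeric("copy"))  # &#169;

# The named, decimal, and hexadecimal forms all decode to the same character.
print(html.unescape("&lt;") == html.unescape("&#60;") == html.unescape("&#x3C;"))  # True
```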

Tip

XHTML includes the XML entity declaration for the apostrophe (&apos;). In HTML, the apostrophe ...

Get Web Design in a Nutshell, 3rd Edition now with the O’Reilly learning platform.