Appendix E

Character Encodings

Appendix D, “Color Names and Values,” discusses how computers store information and explains that a character-encoding scheme is a table that translates between characters and the way they are stored in the computer.

The American Standard Code for Information Interchange (ASCII) is probably the most widely used character set (or character encoding) for encoding text electronically. You can expect all computers browsing the web to understand ASCII.

The problem with ASCII is that it supports only the uppercase and lowercase Latin alphabet, the digits 0–9, and some extra characters: a total of 128 characters. Table E-1 lists the printable characters of ASCII. (The other characters are control characters, such as line feeds and carriage returns.)
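The 128-character limit is easy to check for yourself. The following is a minimal sketch in Python (not a language used elsewhere in this book) that splits the ASCII range into printable and control characters. The variable names are illustrative, and the split relies on Python's own definition of "printable," which counts the space character.

```python
# ASCII covers code points 0 through 127 (128 characters in total).
all_ascii = [chr(code) for code in range(128)]

# Python's str.isprintable() treats the space (code 32) as printable
# and excludes control characters such as line feed and carriage return.
printable = [ch for ch in all_ascii if ch.isprintable()]
control = [ch for ch in all_ascii if not ch.isprintable()]

print(len(all_ascii), len(printable), len(control))
print("".join(printable))
```

The first `print` reports the three counts, and the second shows every printable character on one line, starting with a space and ending with the tilde.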

Table E-1: Printable Characters of ASCII


However, many languages use either accented Latin characters or completely different alphabets. ASCII cannot represent these characters, so you need to learn about character encodings if you want to use any non-ASCII characters.
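To see what "cannot represent" means in practice, here is a small Python sketch that tries to store one accented word in three encodings. The helper `try_encode` is a hypothetical name introduced for this illustration, and the choice of Latin-1 and UTF-8 as comparison encodings is an assumption made for the example rather than something this appendix has introduced yet.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

word = "caf\u00e9"  # "cafe" with an acute accent on the final e

ascii_bytes = try_encode(word, "ascii")     # fails: the accented e is not in ASCII
latin1_bytes = try_encode(word, "latin-1")  # one byte for the accented e
utf8_bytes = try_encode(word, "utf-8")      # two bytes for the accented e

print(ascii_bytes, latin1_bytes, utf8_bytes)
```

ASCII has no slot for the accented letter, so the first attempt fails. The other two encodings succeed but store the same letter as different bytes, which is why a reader of the bytes needs to know which encoding was used.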

Character encodings are also important if you want to use symbols (from some dashes to some quotation mark characters) because these cannot be guaranteed to transfer properly between different encodings. If you do not indicate the character encoding the document is written in, some of the special characters ...
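The following Python sketch illustrates one way a dash and a quotation mark can fail to transfer between encodings. It assumes, purely for illustration, that the text was saved as UTF-8 and then read back as Windows-1252 (`cp1252` in Python); the variable names are hypothetical.

```python
# An en dash, a space, and a left double quotation mark.
dash_and_quote = "\u2013 \u201c"

# Save the text as UTF-8 bytes.
stored = dash_and_quote.encode("utf-8")

# Read the same bytes back with the wrong encoding, then the right one.
misread = stored.decode("cp1252")
correct = stored.decode("utf-8")

print(repr(dash_and_quote), repr(misread), repr(correct))
```

Read with the wrong encoding, each symbol turns into three unrelated characters, while reading with the encoding that was actually used restores the original text.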
