About Character Encodings

When you type, a computer translates each character into bits. The system it uses for doing this is called a character encoding. The most basic character encoding is called ASCII and has 128 characters: the letters of the English alphabet, the numbers, and some common symbols.

In non-English speaking countries, ASCII is clearly insufficient. Instead, they use slightly larger encodings sometimes encompassing more than one language. In order to maintain compatibility with English, these encodings treat the first 128 characters in the same way as ASCII and assign 128 new characters to positions 129–256. The only problem is that each regional encoding does it in a different way. So, for example, if you want to write in ...

Get HTML, XHTML, & CSS, Sixth Edition: Visual QuickStart Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.