UNICODE, MICROSOFT, AND “SMART QUOTES”

Although Unicode is now the official HTML character set, there are still legacy issues to be aware of, such as the mismatch between Unicode and Codepage 1252. Microsoft created Codepage 1252 as an alternative to Latin 1. In doing so, it added characters that included, among other things, the curly quotes popularly referred to as “smart quotes.” The full list of characters that Microsoft added in the 128–159 code point range is shown in Table C.4.

Table C.4. The “Special Characters” of Codepage 1252
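Character  Numeric Entity
€          &#128;
‚          &#130;
ƒ          &#131;
„          &#132;
…          &#133;
†          &#134;
‡          &#135;
ˆ          &#136;
‰          &#137;
Š          &#138;
‹          &#139;
Œ          &#140;
Ž          &#142;
‘          &#145;
’          &#146;
“          &#147;
”          &#148;
•          &#149;
...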
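
The mismatch can be seen directly by decoding the same bytes two ways. The following is a minimal sketch in Python, not taken from the book: the sample bytes and the helper name cp1252_numeric_to_unicode are hypothetical and chosen only for illustration. It assumes Python's built-in "cp1252" and "latin-1" codecs.

```python
# Bytes 0x93 and 0x94 are the curly double quotes in Codepage 1252.
raw = b"\x93smart quotes\x94"

# Decoded as Codepage 1252, the bytes become the Unicode curly quotes.
as_cp1252 = raw.decode("cp1252")
# Decoded as Latin 1, the same bytes become control characters U+0093 and U+0094.
as_latin1 = raw.decode("latin-1")

print([hex(ord(c)) for c in (as_cp1252[0], as_cp1252[-1])])
print([hex(ord(c)) for c in (as_latin1[0], as_latin1[-1])])


def cp1252_numeric_to_unicode(code_point: int) -> str:
    """Hypothetical helper: read a numeric entity value in the 128-159 range
    as a Codepage 1252 byte and return the Unicode character it stands for.
    Values outside that range are returned as ordinary Unicode characters."""
    if 128 <= code_point <= 159:
        try:
            return bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            # Some positions in the range are unassigned in Python's
            # cp1252 codec; fall back to the raw code point.
            return chr(code_point)
    return chr(code_point)


print(cp1252_numeric_to_unicode(147) == "\u201c")
```

A numeric entity such as &#147; names a control character if read strictly as Unicode, so a helper along these lines is one way to recover the character the author most likely intended.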
