Diacritical Marks

One of the principles underlying Unicode is dynamic composition: you can represent the marked form of a letter using two code points, one for the letter followed by one for the mark. Quite a few of the letters in the Latin blocks are marked forms, that is, base letters with some kind of mark applied. All of these characters can be represented as a base letter followed by a combining mark (or by more than one mark, for letters that carry several). To make this possible, Unicode includes a whole block of characters, the Combining Diacritical Marks block, which runs from U+0300 to U+036F. The characters in this block are special because they have combining semantics: they always modify the character that precedes them in storage. As a consequence, these characters are generally ...
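To make the idea concrete, here is a minimal sketch in Python, using the standard `unicodedata` module, of a base letter followed by a combining mark. The helper `in_combining_diacritical_marks` and the variable names are illustrative assumptions, not taken from the book. The letter "e" and the acute accent are chosen only as an example.

```python
import unicodedata

# Hypothetical helper: checks membership in the Combining Diacritical
# Marks block by its code point range, U+0300 to U+036F.
def in_combining_diacritical_marks(ch: str) -> bool:
    return 0x0300 <= ord(ch) <= 0x036F

base = "e"                   # LATIN SMALL LETTER E
mark = "\u0301"              # COMBINING ACUTE ACCENT
composed_pair = base + mark  # the mark follows the letter it modifies
precomposed = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE

# The two spellings are different code point sequences...
assert composed_pair != precomposed
assert len(composed_pair) == 2 and len(precomposed) == 1

# ...but normalization maps each form to the other.
assert unicodedata.normalize("NFC", composed_pair) == precomposed
assert unicodedata.normalize("NFD", precomposed) == composed_pair

# A nonzero canonical combining class identifies the mark as combining.
assert in_combining_diacritical_marks(mark)
assert unicodedata.combining(mark) != 0
assert unicodedata.combining(base) == 0
```

Because the two spellings compare unequal as raw strings, code that compares or searches text usually normalizes both sides to the same form first.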
