General Character Properties

Each character has a set of properties that serve to identify the character. These include the name, Unicode 1.0 name, Jamo short name, ISO 10646 comment, block, and script.

Standard Character Names

First among these properties, of course, is the character's name, which is given both in the book and in the UnicodeData.txt file. The name is always in English, and the only legal characters for the name are the 26 Latin capital letters, the 10 Western digits, the space, and the hyphen. The name is important because it is the primary guide to which character a code point represents. The names generally follow some conventions:

  • For those characters that belong to a particular script (writing system), the script name is included ...
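
The name rules described above can be checked with a short Python sketch. It relies on the standard library's unicodedata module, whose data reflects whichever Unicode version ships with the interpreter rather than reading UnicodeData.txt directly. The names LEGAL_NAME and has_legal_name are illustrative helpers introduced here, not part of any standard API.

```python
import re
import unicodedata

# Characters permitted in a name: capital A-Z, digits 0-9, space, hyphen.
# (Illustrative pattern, written for this sketch.)
LEGAL_NAME = re.compile(r"[A-Z0-9 -]+")


def has_legal_name(ch: str) -> bool:
    """Return True if ch has a name made up only of the permitted characters."""
    # unicodedata.name returns the default ("") when the database holds
    # no name for ch, as is the case for control characters.
    name = unicodedata.name(ch, "")
    return LEGAL_NAME.fullmatch(name) is not None


# Print the code point, the name, and the result of the check for a few
# characters from different scripts.
for ch in ["A", "\u0416", "\u03b1", "-", "\u4e00"]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), has_legal_name(ch))
```

The sample characters show the script-name convention as well: the names reported for U+0041, U+0416, and U+03B1 begin with LATIN, CYRILLIC, and GREEK respectively.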
