Multi-Byte Character Sets

Most programmers are accustomed to working with single-byte character sets. In the U.S., we like to pretend that ASCII is the only meaningful mapping between characters and numbers. This is not the case. Standards organizations such as ANSI (American National Standards Institute) and the ISO (International Organization for Standardization) have defined many different encodings, each of which associates a unique number with each character in a given character set. Theoretically, a single-byte character set can encode 256 different characters, because one byte can hold 256 distinct values. In practice, however, most single-byte character sets are limited to about 96 visible characters. The range of values is cut in half by the fact that the most significant bit is sometimes considered off-limits ...
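The arithmetic behind these limits can be checked with a short Python sketch. This is an illustration only, not something taken from the book: the variable names are made up for the example, and the encodings chosen for comparison (`latin-1` and `iso8859_7`) are simply two single-byte codecs that ship with Python. Note that Python's `isprintable()` counts the space character as printable, so the tally it produces for 7-bit ASCII may differ slightly from the "about 96" figure above.

```python
# One byte holds 8 bits, so a single-byte character set
# can define at most 256 distinct values.
MAX_CODE_POINTS = 2 ** 8

# If the most significant bit is treated as off-limits,
# only 7 bits (128 values) remain.
SEVEN_BIT_VALUES = 2 ** 7

# Of those 128 values, keep only the ones that are not control characters.
printable_ascii = [
    chr(code) for code in range(SEVEN_BIT_VALUES) if chr(code).isprintable()
]

# The same byte value can stand for different characters,
# depending on which single-byte encoding is used to interpret it.
raw = bytes([0xE9])
decoded = {name: raw.decode(name) for name in ("latin-1", "iso8859_7")}

if __name__ == "__main__":
    print(MAX_CODE_POINTS, SEVEN_BIT_VALUES, len(printable_ascii))
    print(decoded)
```

The last step is the practical point: a byte such as `0xE9` has no meaning of its own until an encoding is chosen, which is why a database needs to know which character set its data is stored in.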
