4.3. UTF-8 Support

The LDAP V3 protocol specifies that string values are encoded in UTF-8 (see Figure 20 below). UTF-8 (UCS Transformation Format, 8-bit) transforms each Unicode character into a sequence of bytes: one to four bytes per character under the current definition (RFC 3629), although the original definition allowed sequences of up to six bytes. As such, UTF-8 gives a clearly defined conversion to and from Unicode while remaining compatible with ASCII, because every ASCII character is encoded as the same single byte it occupies in ASCII.
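The variable-length encoding described above can be illustrated with a minimal Python sketch. The sample characters and the helper name utf8_length are chosen here for illustration only and do not come from the text; the sketch uses only the standard str.encode and bytes.decode methods.

```python
# Illustrative sketch: how many bytes UTF-8 uses for a few sample characters.
# The samples and the helper name are assumptions made for this example.

def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))

# One character from each UTF-8 length class (1, 2, 3 and 4 bytes).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # The conversion is reversible: decoding returns the original character.
    assert encoded.decode("utf-8") == ch
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{ord(ch):04X} -> {hex_bytes} ({utf8_length(ch)} byte(s))")

# ASCII characters keep their single-byte ASCII value in UTF-8.
assert "A".encode("utf-8") == "A".encode("ascii")
```

A client that holds data in some other local character set would need to convert it to UTF-8 in a similar way before passing string values to an LDAP V3 server.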

Figure 20. Data passes in UTF-8 character set

Passing data in UTF-8 simplifies and standardizes the handling of national language characters in the protocol and on the server. The client libraries ...
