Character Sets and Codings

Most people are familiar with the ASCII character set. This 7-bit character set defines 128 characters: the upper- and lowercase letters a-z, the digits 0-9, control characters, and special characters such as punctuation. Although this character set is sufficient for most English-language programs, it is not sufficient for programs in other languages, such as Spanish and German, which require accented letters, and Japanese, which requires an entirely different character set.
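As a rough illustration of the 7-bit limit, the sketch below checks whether every character of a string falls in the ASCII range 0-127. The class name AsciiCheck and the method isAscii are hypothetical names chosen for this example; they are not part of the Java API.

```java
public class AsciiCheck {
    // Hypothetical helper: returns true only if every char in s
    // fits in 7 bits (values 0 through 127), i.e. is an ASCII character.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Plain English text stays within ASCII.
        System.out.println(isAscii("Hello"));
        // Spanish "manana" written with n-tilde (code point 241) does not.
        System.out.println(isAscii("ma\u00F1ana"));
        // Neither do the two kanji that spell "Japan".
        System.out.println(isAscii("\u65E5\u672C"));
    }
}
```

The non-ASCII characters are written as Unicode escapes so that the example does not depend on the encoding of the source file.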

Unlike earlier programming languages, Java has provided comprehensive support for the Unicode 2.0 character set from the beginning.

In the form that Java's char type uses, Unicode is a 16-bit character set, meaning that it is capable of representing 65,536 (2^16) characters. This is a large character set; it can be used to represent ...
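The 65,536 figure can be checked against Java's char type, which holds one 16-bit value. The following is a minimal sketch; the class name CharRange and the method distinctCharValues are hypothetical names introduced only for this illustration.

```java
public class CharRange {
    // Hypothetical helper: counts the distinct values a char can hold.
    // Character.MIN_VALUE is '\u0000' and Character.MAX_VALUE is '\uFFFF';
    // the subtraction is carried out in int arithmetic.
    static int distinctCharValues() {
        return Character.MAX_VALUE - Character.MIN_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println((int) Character.MAX_VALUE); // 65535
        System.out.println(distinctCharValues());      // 65536

        // A char can hold a non-ASCII character directly.
        char enye = '\u00F1'; // n with tilde
        System.out.println((int) enye);                // 241
    }
}
```

Casting a char to int exposes its numeric value, which is how the example prints the upper bound of the range.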
