Unicode Normalization

The other category of Unicode-to-Unicode transformations we need to consider is Unicode normalization. You may or may not need to care about Unicode normalization as such; often it happens as a by-product of some other process. For example, character code conversion is usually written to produce normalized Unicode text; a legacy encoding naturally converts to one normalized form or the other in most cases. (Of course, not all converters available on a particular system may produce the same normalized form, unless they've all been designed to do so.) Similarly, keyboard layouts and input methods can be designed to produce normalized text, usually without any extra work; again this might mean that different input methods and ...
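To make the concern concrete, here is a minimal sketch of what can happen when two sources produce different normalized forms of the same text. It assumes Python and its standard `unicodedata` module, neither of which the text itself refers to, and the helper name `same_text` is purely illustrative.

```python
import unicodedata

# Two representations of the same visible text, "é":
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after bringing both to the same normalization form.

    Hypothetical helper for illustration; `form` may be any form accepted by
    unicodedata.normalize ("NFC", "NFD", "NFKC", "NFKD").
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A plain code-point comparison treats the two strings as different...
raw_equal = precomposed == decomposed

# ...while comparing after normalization treats them as the same text.
normalized_equal = same_text(precomposed, decomposed)

# Decomposing the precomposed character yields two code points.
nfd_length = len(unicodedata.normalize("NFD", precomposed))
```

If one converter or input method emitted the precomposed form and another the decomposed form, a naive comparison would report a mismatch. Normalizing both sides to an agreed form before comparing is one way to avoid that.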

