Character sets summary

Many character sets and encodings have been discussed here, but the main point to recall is that most of these standards overlap:

It should be noted, however, that ISO 8859/1 does not fit comfortably into this picture. It is not possible to 'pretend' that an ISO 8859/1 format file is in fact one of these other standards (not even Unicode, because ISO 8859/1 uses just one byte for each character). It is unfortunate that the encoding scheme that was the most popular in the early days of XML is not a default XML format.
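The mismatch can be seen at the byte level. The following is a minimal Python sketch, not taken from the book: the sample word and the helper name `is_valid_utf8` are assumptions chosen for illustration. It encodes the same text in several ways and checks which byte sequences can pass as UTF-8.

```python
# Illustrative sketch: the sample text and helper name are assumptions.
text = "café"  # contains one character outside the 7-bit ASCII range

ascii_bytes = "cafe".encode("ascii")     # pure ASCII text
latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # 'é' becomes two bytes
utf16_bytes = text.encode("utf-16-be")    # two bytes per character, no BOM


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ASCII overlaps with UTF-8: the same bytes are valid in both.
print(is_valid_utf8(ascii_bytes))   # True

# ISO 8859/1 bytes above 0x7F cannot pass as UTF-8.
print(is_valid_utf8(latin1_bytes))  # False

# The character value is shared with Unicode (0xE9), but the stored bytes differ.
print(ord("é") == latin1_bytes[-1])  # True
print(latin1_bytes, utf8_bytes, utf16_bytes)
```

The last lines show why the overlap is only partial: ISO 8859/1 and Unicode agree on the value assigned to 'é', yet neither UTF-8 nor UTF-16 stores that value as the single byte an ISO 8859/1 file contains.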

It should be recalled that all XML processors must accept UTF-8 and UTF-16 formats as standard. ...
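As a rough illustration of this requirement, the sketch below feeds the same small document to Python's standard `xml.etree.ElementTree` parser in three encodings, with no encoding declaration in any of them. The document content and the helper name `parses` are assumptions made for this example, and the behaviour shown is that of one particular parser rather than a statement about every XML processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with one non-ASCII character.
doc = "<word>café</word>"


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


utf8_ok = parses(doc.encode("utf-8"))
# Python's "utf-16" codec prepends a byte order mark, which the parser
# uses to detect the encoding.
utf16_ok = parses(doc.encode("utf-16"))
# The same text as undeclared ISO 8859/1 bytes is read as UTF-8 and rejected.
latin1_ok = parses(doc.encode("iso-8859-1"))

print(utf8_ok, utf16_ok, latin1_ok)
```

The UTF-8 and UTF-16 inputs are accepted without any extra information, while the undeclared ISO 8859/1 input is not, which matches the point that ISO 8859/1 is not one of the default formats.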
