Text Declarations

XML documents may be composed of multiple parsed entities, as you learned in Chapter 3. The external parsed entities may be DTD fragments or chunks of XML that are inserted into the master document using external general entity references. In either case, an external parsed entity does not necessarily use the same character set as the master document. Indeed, one external parsed entity may be referenced by several different documents, each of which is written in a different character set. Therefore, the character set of an external parsed entity must be specified independently of the character set that the including document uses.

To accomplish this task, each external parsed entity should have a text declaration. If present, the text declaration must be the very first thing in the external parsed entity. For example, this text declaration says that the entity is encoded in the KOI8-R character set:

<?xml version="1.0" encoding="KOI8-R"?>
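To illustrate how software reading an external parsed entity might use such a declaration, here is a minimal Python sketch. The function name decode_external_entity and the UTF-8 fallback are assumptions made for this illustration, not part of any XML library. The sketch also assumes an ASCII-compatible encoding (so it would not handle UTF-16), treats the version information as optional, and is more lenient than a real parser about whitespace and matching quote characters.

```python
import re

# Hypothetical helper: matches a leading text declaration in raw bytes and
# captures the encoding name. Assumes an ASCII-compatible encoding.
_TEXT_DECL = re.compile(
    rb"""<\?xml(?:\s+version\s*=\s*["']1\.[0-9]+["'])?"""
    rb"""\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']\s*\?>"""
)


def decode_external_entity(raw, default="utf-8"):
    """Decode an entity's bytes using the encoding its text declaration names.

    The declaration itself is dropped from the returned text. If the bytes do
    not start with a text declaration, the default encoding is assumed.
    """
    match = _TEXT_DECL.match(raw)
    if match is None:
        return raw.decode(default)
    encoding = match.group(1).decode("ascii")
    return raw[match.end():].decode(encoding)


# Build a small KOI8-R entity in memory and decode it again.
raw_entity = b'<?xml version="1.0" encoding="KOI8-R"?>' + "Медный всадник".encode("koi8-r")
print(decode_external_entity(raw_entity))
```

A conforming XML parser performs this step itself; the sketch only shows why the declaration has to come first, since the encoding must be known before the rest of the entity can be decoded.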

The text declaration looks like an XML declaration: it has version information and an encoding declaration. However, a text declaration must not have a standalone declaration. Furthermore, the version information may be omitted, although the encoding declaration may not. A legal text declaration that specifies the encoding as KOI8-R might look like this:

<?xml encoding="KOI8-R"?>

However, this is not a legal XML declaration, because an XML declaration must include version information.
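The difference between the two kinds of declaration can be sketched as a pair of regular expressions in Python. The patterns below are an approximation written for this illustration, modeled on the XML 1.0 grammar productions for text declarations and XML declarations; the function name classify is an assumption, and the patterns are not a substitute for a real parser.

```python
import re

# Building blocks, approximating the XML 1.0 grammar (an illustrative sketch).
S = r"[ \t\r\n]+"                      # required white space
EQ = r"[ \t\r\n]*=[ \t\r\n]*"          # equals sign with optional white space
VERSION = S + r"version" + EQ + r"""(?:"1\.[0-9]+"|'1\.[0-9]+')"""
ENCODING = (
    S + r"encoding" + EQ
    + r"""(?:"[A-Za-z][A-Za-z0-9._-]*"|'[A-Za-z][A-Za-z0-9._-]*')"""
)
STANDALONE = S + r"standalone" + EQ + r"""(?:"(?:yes|no)"|'(?:yes|no)')"""
END = r"[ \t\r\n]*\?>"

# Text declaration: version optional, encoding required, no standalone.
TEXT_DECL = re.compile(r"<\?xml(?:" + VERSION + r")?" + ENCODING + END)

# XML declaration: version required, encoding and standalone optional.
XML_DECL = re.compile(
    r"<\?xml" + VERSION + r"(?:" + ENCODING + r")?(?:" + STANDALONE + r")?" + END
)


def classify(decl):
    """Report whether a string is legal as a text and/or XML declaration."""
    return {
        "text_declaration": TEXT_DECL.fullmatch(decl) is not None,
        "xml_declaration": XML_DECL.fullmatch(decl) is not None,
    }


print(classify('<?xml version="1.0" encoding="KOI8-R"?>'))  # legal as both
print(classify('<?xml encoding="KOI8-R"?>'))                # text declaration only
print(classify('<?xml version="1.0" standalone="yes"?>'))   # XML declaration only
```

Only a declaration with version information, an encoding declaration, and no standalone declaration satisfies both patterns.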

Example 5-1 shows an external parsed entity containing several verses from Pushkin’s The Bronze Horseman in a Cyrillic script. The text declaration ...

