WordprocessingML’s Style of Markup

If you have any XML or HTML markup background, then WordprocessingML’s style of markup may surprise you. WordprocessingML was not designed from a clean slate for the purpose of creating documents in XML markup. Instead, it is an unveiling of the internal structures that have been present in Microsoft Word for years. Though certain features have been added to make WordprocessingML usable outside the context of Word, by and large it represents a serialization of Word’s internal data structures: various kinds of objects associated with myriad property values. Indeed, the object-oriented term “properties” permeates the WordprocessingML schema. If you want to make a run of text bold, you set the bold property. If you want to indent a particular paragraph, you set its indentation property. And so on.
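To make the property-setting style concrete, here is a minimal sketch in Python that builds a bold, indented paragraph with the standard library's ElementTree. The helper names w and make_paragraph are hypothetical, invented for this illustration. The element names (w:p, w:pPr, w:ind, w:r, w:rPr, w:b, w:t) and the namespace URI are those of Word 2003's WordprocessingML. The indentation value is measured in twips, so 720 is half an inch.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003's WordprocessingML.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W)


def w(name):
    """Hypothetical helper: qualify a tag or attribute name with the w: namespace."""
    return f"{{{W}}}{name}"


def make_paragraph(text, bold=False, indent_left=None):
    """Hypothetical helper: build one w:p holding a single run of text."""
    p = ET.Element(w("p"))
    if indent_left is not None:
        # Paragraph-level properties live in w:pPr.
        p_pr = ET.SubElement(p, w("pPr"))
        ET.SubElement(p_pr, w("ind"), {w("left"): str(indent_left)})
    r = ET.SubElement(p, w("r"))
    if bold:
        # Run-level properties live in w:rPr; an empty w:b turns bold on.
        r_pr = ET.SubElement(r, w("rPr"))
        ET.SubElement(r_pr, w("b"))
    # The text itself always goes in a w:t element.
    ET.SubElement(r, w("t")).text = text
    return p


paragraph = make_paragraph("Hello, Word.", bold=True, indent_left=720)
print(ET.tostring(paragraph, encoding="unicode"))
```

Note that nothing wraps the text to make it bold. The formatting is recorded as a property of the run that contains the text, which is the pattern the rest of this section builds on.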

No Mixed Content

Mixed content describes the presence of text content and elements inside the same parent element. It is standard fare in the world of markup, especially when using document-oriented markup. For example, in HTML, to make a sentence bold and only partially italicized, you would use code such as the following:

<b>This sentence has <i>mixed</i> formatting.</b>
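One way to see the mixed content in this example is to parse it and look at where the text ends up. The sketch below uses Python's ElementTree, which stores text that precedes the first child in the parent's text attribute and text that follows a child in that child's tail attribute. The function name is_mixed is hypothetical, chosen for this illustration.

```python
import xml.etree.ElementTree as ET

b = ET.fromstring("<b>This sentence has <i>mixed</i> formatting.</b>")
i = b.find("i")

leading_text = b.text    # text inside <b>, before the <i> child
child_text = i.text      # text inside <i>
trailing_text = i.tail   # text after </i>, still inside <b>


def is_mixed(elem):
    """Hypothetical check: does elem hold both child elements and its own text?"""
    own_text = [elem.text] + [child.tail for child in elem]
    has_text = any(t and t.strip() for t in own_text)
    return has_text and len(elem) > 0


print(repr(leading_text), repr(child_text), repr(trailing_text))
print(is_mixed(b), is_mixed(i))
```

The b element is mixed because it directly contains both text and the i element. The i element is not, because it contains text only.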

WordprocessingML, however, never uses mixed content. All of the text in a WordprocessingML document resides in w:t elements, and w:t elements can contain only text, never child elements. The sentence above is represented quite differently in WordprocessingML. The hierarchy is flattened into ...
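The idea of flattening can be sketched independently of the exact markup. The Python example below walks the nested HTML sentence and produces a flat list of text pieces, each paired with the set of formats in effect for it. The function name flatten and the choice to represent formats by their HTML tag names are assumptions made for this illustration, not the book's method. Each resulting pair corresponds to one piece of text that carries all of its own formatting, which is the shape of data that a run with a text-only w:t element can hold.

```python
import xml.etree.ElementTree as ET


def flatten(elem, active=frozenset()):
    """Hypothetical helper: yield (text, formats) pairs in document order."""
    formats = active | {elem.tag}
    if elem.text:
        yield elem.text, formats
    for child in elem:
        yield from flatten(child, formats)
        if child.tail:
            # Tail text sits outside the child, so it gets the parent's formats.
            yield child.tail, formats


html = ET.fromstring("<b>This sentence has <i>mixed</i> formatting.</b>")
runs = list(flatten(html))

for text, formats in runs:
    print(sorted(formats), repr(text))
```

The nested sentence becomes three sibling pieces: two that are bold only, and one in the middle that is both bold and italic. No piece contains another, so there is no mixed content left.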
