On-the-fly schema validation as the user edits the document
Schema validation happens automatically as a user edits the document. If a particular element declared in an attached schema is present in the document and does not conform to the type defined in the schema, then Word will flag this as an error. We’ve seen examples of this in our press release example, for certain simple types such as
Schema-driven editing functionality
Schema-driven editing functionality is exposed through the XML Structure task pane (covered below) and the Document Actions task pane (covered in Chapter 5).
The Word UI allows you to manually attach schemas to the currently open document. Figure 4-17 shows the appropriate dialog, which you can access by selecting Tools → Templates and Add-Ins → XML Schema.
The “Available XML schemas” list contains the aliases for all of the schemas in the schema library. In this example, the Press Release checkbox is checked, which means that the press release schema is attached to the current document. Multiple schemas can be attached to the same document, just as elements from multiple namespaces can be used in the same XML document.
The Add Schema... button lets you browse for an XSD schema document file in order to add it to your machine’s schema library. By default, it also attaches the schema to the document—automatically checking the corresponding checkbox that newly appears in the “Available XML schemas” list. The Schema Library button opens the Schema Library dialog, which we looked at earlier.
If all you ever do is manually attach schemas through the Word UI, the process of “schema attachment” may seem a little mysterious. The first thing to do is to stop thinking of it as a process. Instead, think of it as a property of the underlying WordprocessingML document. Secondly, it’s important to understand that Word treats namespaces and schemas as virtually synonymous. That a “schema is attached” to a document means nothing more than the fact that a non-WordprocessingML namespace declaration is present somewhere inside the WordprocessingML document. A “non-WordprocessingML namespace declaration” is a declaration for any namespace other than the namespaces reserved for Word that were introduced in Chapter 2. So when Word says that a schema is attached to a document, it really means that a namespace is attached.
The fact that a schema is attached to the document is independent of whether a corresponding schema library entry is present on the current user’s machine. It doesn’t even matter if the document contains an element or attribute that uses the namespace.
Example 4-6 shows a simple WordprocessingML document with a schema attached, i.e., with a namespace declaration that is not one of Word’s reserved namespaces.
Example 4-6. A WordprocessingML document with a “schema attached”
<?xml version="1.0"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:foo="http://xmlportfolio.com/pressRelease">
  <w:body/>
</w:wordDocument>
If someone in our imaginary PR department opened this document in Word and selected Tools → Templates and Add-Ins → XML Schema, they would see something very similar to the dialog box we saw in Figure 4-17 (assuming they already have the Press Release schema in their schema library). Specifically, the Press Release checkbox would be checked. As far as Word is concerned, the mere presence of the namespace declaration (anywhere in the document) means that the schema is attached, regardless of whether any elements or attributes in the document use the namespace.
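To make this definition concrete, here is a minimal Python sketch (not part of Word or of any Office API) that lists the namespaces Word would treat as "attached schemas" in a WordprocessingML document. It only looks at namespace declarations, never at the elements that use them. The function name attached_namespaces is hypothetical, and the WORD_RESERVED set is an assumption that covers only a few of the reserved namespaces introduced in Chapter 2.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: a partial, illustrative list of Word's reserved namespaces.
WORD_RESERVED = {
    "http://schemas.microsoft.com/office/word/2003/wordml",
    "http://schemas.microsoft.com/office/word/2003/auxHint",
    "urn:schemas-microsoft-com:vml",
    "urn:schemas-microsoft-com:office:office",
    "urn:schemas-microsoft-com:office:word",
}

# The document from Example 4-6.
EXAMPLE_4_6 = """<?xml version="1.0"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:foo="http://xmlportfolio.com/pressRelease">
  <w:body/>
</w:wordDocument>"""


def attached_namespaces(xml_text):
    """Return every declared namespace URI that is not reserved for Word.

    The "start-ns" event fires for each namespace declaration, whether or
    not any element or attribute actually uses that namespace.
    """
    source = io.BytesIO(xml_text.encode("utf-8"))
    found = []
    for _event, (_prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if uri not in WORD_RESERVED and uri not in found:
            found.append(uri)
    return found


print(attached_namespaces(EXAMPLE_4_6))
# prints ['http://xmlportfolio.com/pressRelease']
```

Note that the foo prefix is never used by any element in Example 4-6, yet its namespace is still reported, which mirrors how Word decides that a schema is attached.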
What happens if the user doesn’t have a corresponding schema library entry? In that case, the schema is no less attached, because we’ve defined “schema attachment” as the presence of a non-WordprocessingML namespace declaration. However, in this case, the attached schema would be considered “unavailable.” Figure 4-18 shows how the Word UI handles this scenario.
As you can see, a checkbox is still checked, meaning that “a schema is attached.” The only difference is that, since there is no corresponding schema library entry, this schema is considered to be “Unavailable.” And without a corresponding XSD schema document, schema validation and schema-driven editing are not possible.
Thus, for schema validation to work correctly, two conditions must hold:
The schema must be attached (the namespace must be declared in the document).
The schema must be available (in the machine’s schema library).
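These two conditions can be sketched as a small lookup. The following Python is an illustration only: it models the schema library as a plain dictionary from namespace URI to alias, which is an assumption made for the example and not how Word stores its library. The names schema_status and can_validate are hypothetical, as is the second namespace URI.

```python
def schema_status(attached, schema_library):
    """Map each attached namespace to its alias, or to "Unavailable"
    when the schema library has no entry for it."""
    return {ns: schema_library.get(ns, "Unavailable") for ns in attached}


def can_validate(ns, attached, schema_library):
    """Validation needs both conditions: attached and available."""
    return ns in attached and ns in schema_library


# Hypothetical data: one namespace with a library entry, one without.
library = {"http://xmlportfolio.com/pressRelease": "Press Release"}
attached = [
    "http://xmlportfolio.com/pressRelease",
    "http://example.com/unknown",  # made-up namespace with no library entry
]

print(schema_status(attached, library))
print(can_validate("http://example.com/unknown", attached, library))  # prints False
```

A namespace that is in the library but not declared in the document fails the first condition, so can_validate returns False for it as well.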
Now let’s relate all of this back to our primary use case—using Word as an XML editor. If you recall the basic processing model, the first thing that happens when Word opens an arbitrary XML document is that an XSLT stylesheet is applied to it, converting it to WordprocessingML. Even though the schema library is consulted to see which XSLT stylesheet to apply (based on the namespace of the document’s root element), no schemas have been attached at this point.
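The lookup step described above, choosing a stylesheet from the namespace of the root element, can be sketched in Python as follows. This is a rough model of the idea, not Word's implementation: the dictionary standing in for the schema library, the pressRelease root element name, the stylesheet filename, and the function names are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("start",)):
        # The first "start" event is always the root element.
        # ElementTree writes qualified names as "{namespace}local".
        if elem.tag.startswith("{"):
            return elem.tag[1:].split("}", 1)[0]
        return None
    return None


def pick_onload_stylesheet(xml_text, library):
    """Look up the onload stylesheet registered for the root namespace."""
    return library.get(root_namespace(xml_text))


# Hypothetical library entry and input document.
LIBRARY = {"http://xmlportfolio.com/pressRelease": "pressRelease2word.xsl"}
PRESS_RELEASE = '<pr:pressRelease xmlns:pr="http://xmlportfolio.com/pressRelease"/>'

print(pick_onload_stylesheet(PRESS_RELEASE, LIBRARY))
# prints pressRelease2word.xsl
```

Nothing in this step attaches a schema. The lookup only selects a transformation, and attachment depends entirely on what that transformation outputs.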
Whether a schema is ultimately attached to the document that the user edits is completely determined by whether the result of the transformation includes any non-WordprocessingML namespace declarations. Of course, if the result document contains any custom XML elements in your schema’s namespace, then the schema will de facto be attached (because you can’t have an element without declaring its namespace). And since schema validation is usually only useful when custom XML elements are already present, schema attachment is usually an automatic thing you don’t have to think about; it just happens. Even so, understanding how it works is helpful for debugging and for explaining where unwanted schemas come from: namely, wayward namespace declarations in the result of the onload transformation. (The onload XSLT stylesheets will therefore often use exclude-result-prefixes attributes to prevent unwanted namespace declarations appearing in the WordprocessingML