Some Ideas to Make Disambiguation Easier

To close this chapter, I’ll present some ideas that should ease the challenge of disambiguating schemas.

Generalizing the Except Pattern

Among the different forms of ambiguity, name classes have been the easiest to disambiguate. Why is this? Name classes aren’t inherently simpler than regular expressions or datatypes. All these tools define sets of things that can happen in XML documents, and in many ways they are deeply similar. The reason name classes have been easier to disambiguate is that they have a first-class except operator. If you had the same level of support for patterns and datatypes, you could disambiguate them more easily.

It is possible to apply the except pattern to datatypes and write:

element foo { (xsd:boolean - xsd:integer) | xsd:integer }

A value that is only an integer obviously matches only the right alternative. A value that is exclusively boolean (true or false) matches only the left alternative. A value that is both a boolean and an integer (0 or 1) satisfies the first condition of the left alternative (xsd:boolean) but is excluded by its except clause (- xsd:integer), so it matches only the right alternative.
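The three cases can be checked with a small model. The Python sketch below is not a RELAX NG validator: it assumes simplified lexical spaces for xsd:boolean (true, false, 0, 1) and xsd:integer (an optional sign followed by digits), ignores whitespace normalization, and uses hypothetical helper names (is_boolean, is_integer, matching_alternatives) chosen only for illustration. It reports which alternative of the choice accepts a given value.

```python
import re

# Simplified lexical spaces (assumption: no whitespace normalization).
BOOLEAN_LEXICAL = {"true", "false", "0", "1"}
INTEGER_RE = re.compile(r"[+-]?[0-9]+")


def is_boolean(value):
    """True if value is in the simplified xsd:boolean lexical space."""
    return value in BOOLEAN_LEXICAL


def is_integer(value):
    """True if value is in the simplified xsd:integer lexical space."""
    return INTEGER_RE.fullmatch(value) is not None


def matching_alternatives(value):
    """List the alternatives of
    (xsd:boolean - xsd:integer) | xsd:integer
    that accept value."""
    matches = []
    # Left alternative: a boolean that is NOT also an integer.
    if is_boolean(value) and not is_integer(value):
        matches.append("left")
    # Right alternative: any integer.
    if is_integer(value):
        matches.append("right")
    return matches


for sample in ["true", "1", "42", "maybe"]:
    print(sample, matching_alternatives(sample))
```

Under these assumptions no value is ever accepted by both alternatives, which is exactly what the except clause is meant to guarantee.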

Unfortunately, this rule can’t be generalized beyond the scope of data patterns. (Note that the examples given next with the except (-) operator aren’t valid RELAX NG.)

If this rule could be generalized, and applied to an ambiguous regular expression such as:

 two|(one?,two+,three*)

you could write:

 two|((one?,two+,three*)-two)
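To see what such a hypothetical operator would buy, the following Python sketch models both expressions over sequences of element names. It is only an illustration of the set semantics, not RELAX NG: the sequence (one?,two+,three*) is approximated with a regular expression over space-terminated names, and the function names (matches_sequence, ambiguous_branches, disambiguated_branches) are invented for this example.

```python
import re

# (one?, two+, three*) over names, each followed by one space.
SEQUENCE_RE = re.compile(r"(one )?(two )+(three )*")


def encode(names):
    """Turn a list of element names into a space-terminated string."""
    return "".join(name + " " for name in names)


def matches_two(names):
    """The left branch: exactly one 'two' element."""
    return list(names) == ["two"]


def matches_sequence(names):
    """The plain right branch: (one?, two+, three*)."""
    return SEQUENCE_RE.fullmatch(encode(names)) is not None


def ambiguous_branches(names):
    """Branches of two|(one?,two+,three*) that accept names."""
    result = []
    if matches_two(names):
        result.append("left")
    if matches_sequence(names):
        result.append("right")
    return result


def disambiguated_branches(names):
    """Branches of two|((one?,two+,three*)-two) that accept names."""
    result = []
    if matches_two(names):
        result.append("left")
    # Hypothetical except: the sequence minus whatever 'two' matches.
    if matches_sequence(names) and not matches_two(names):
        result.append("right")
    return result


print(ambiguous_branches(["two"]))
print(disambiguated_branches(["two"]))
print(disambiguated_branches(["one", "two", "three"]))
```

In this model, a lone two element is accepted by both branches of the original expression but only by the left branch once the except is applied, while longer sequences still fall to the right branch.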

Of course, this same ...
