When to Use String Datatypes

It’s amazing to think that despite all the complex applications that have been made possible by SGML and XML, whitespace processing, which seems as if it should be simple, has remained a nightmare for users and programmers. The string datatype exposes you to all the issues related to whitespace handling, because it compares values character for character, whitespace included. Many users, editors, and applications reindent or otherwise modify whitespace in your documents to meet their own expectations, and any such change can make a document invalid when its values are checked as strings.

The token datatype keeps this nightmare from creating problems: before comparing values, it strips leading and trailing whitespace and collapses each run of whitespace into a single space. That is why RELAX NG uses token as its default datatype. Don’t use the string datatype unless you have a good reason to do so. If whitespace is genuinely significant to your information, use the string type; otherwise, use the token type.
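
The difference can be illustrated outside of any validator. The following Python sketch is not part of RELAX NG or of any RELAX NG library; the helper names `normalize_token`, `matches_as_string`, and `matches_as_token` are hypothetical and only mimic how the two built-in datatypes compare a value from the schema with a value found in a document, assuming that whitespace means the four XML whitespace characters (space, tab, carriage return, and line feed).

```python
import re

# The four characters that XML treats as whitespace (an assumption of this sketch).
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def normalize_token(text: str) -> str:
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return XML_WHITESPACE.sub(" ", text).strip(" ")


def matches_as_string(expected: str, actual: str) -> bool:
    """string-like comparison: every character, whitespace included, must match."""
    return expected == actual


def matches_as_token(expected: str, actual: str) -> bool:
    """token-like comparison: values are compared after whitespace normalization."""
    return normalize_token(expected) == normalize_token(actual)


if __name__ == "__main__":
    schema_value = "hello world"
    # The same content after a tool has reindented and wrapped the document.
    reformatted = "\n    hello\n    world\n  "

    print(matches_as_string(schema_value, reformatted))  # False
    print(matches_as_token(schema_value, reformatted))   # True
```

A document whose text has merely been reindented still matches under the token-like comparison but fails under the string-like one, which is the failure mode described above.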
