Converting Case

Problem

You want to convert an uppercase string to lowercase or vice versa.

Solution

Use the XPath translate( ) function, which is available in every XSLT stylesheet. This code, for example, converts from upper- to lowercase:

translate($input,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')

This example converts from lower- to uppercase:

translate($input, 'abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ')

Discussion

This recipe is, of course, trivial. However, I include it as an opportunity to discuss the solution’s shortcomings. Case conversion is trivial as long as your text is restricted to a single locale whose letters map one-to-one between cases. In English, you rarely, if ever, need to deal with accented characters or with more complicated conversions in which a single character must convert to two characters. The most common example of the latter is German, in which the lowercase “ß” is converted to an uppercase “SS”. Because translate( ) replaces each character with at most one character, it cannot express such a conversion. Many modern programming languages provide case-conversion functions that are sensitive to locale, but XSLT 1.0 does not support this concept directly. This is unfortunate, considering that XSLT has other features supporting internationalization.

A slight improvement can be made by defining general XML entities for each type conversion, as shown in the following example:

<?xml version="1.0" encoding="UTF-8"?>   
<!DOCTYPE stylesheet [
     <!ENTITY UPPERCASE "ABCDEFGHIJKLMNOPQRSTUVWXYZ">
     <!ENTITY LOWERCASE "abcdefghijklmnopqrstuvwxyz">
     <!ENTITY UPPER_TO_LOWER " '&UPPERCASE;' , '&LOWERCASE;' ">
     <!ENTITY LOWER_TO_UPPER " '&LOWERCASE;' , '&UPPERCASE;' ">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   
     <xsl:template match="/">
     <xsl:variable name="test"
          select=" 'The rain in Spain falls mainly in the plain' "/>
     <output>
          <lowercase>
               <xsl:value-of
                    select="translate($test,&UPPER_TO_LOWER;)"/>
          </lowercase>
          <uppercase>
               <xsl:value-of
                    select="translate($test,&LOWER_TO_UPPER;)"/>
          </uppercase>
     </output>
     </xsl:template>
   
</xsl:stylesheet>

These entity definitions accomplish three things. First, they make it easier to port the stylesheet to another locale because only the definitions of the entities UPPERCASE and LOWERCASE need to be changed. Second, they compact the code by eliminating the need to list all letters of the alphabet twice. Third, they make the intent of the translate( ) call obvious to someone inspecting the code. Some purists might object to the macro-izing away of translate( )’s second and third parameters, but I like the way it makes the code read. If you prefer to err on the pure side, then use translate($test, '&UPPERCASE;', '&LOWERCASE;'). The quotes are required because these two entities expand to the bare letters, not to string literals.
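As a sketch of the first point, the stylesheet below ports the entities to German text by appending the umlauts to both strings at matching positions, and it uses the "pure" form of the call with quoted entity references. The extended alphabet and the sample string are assumptions made for illustration; the file must be saved as UTF-8.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
     <!-- Assumed extension: umlauts appended at the same positions in both strings -->
     <!ENTITY UPPERCASE "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ">
     <!ENTITY LOWERCASE "abcdefghijklmnopqrstuvwxyzäöü">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text" encoding="UTF-8"/>

     <xsl:template match="/">
          <!-- Each entity expands to bare letters, so it is quoted here
               to form an XPath string literal -->
          <xsl:value-of
               select="translate('Über Öl', '&UPPERCASE;', '&LOWERCASE;')"/>
     </xsl:template>

</xsl:stylesheet>
```

Applied to any input document, this should emit über öl. Only the two entity declarations changed; the translate( ) call is the same as it would be for English.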

I have not seen entities used very often in other XSLT books; however, I believe the technique has merit. In fact, one benefit of XSLT being written in XML syntax is that you can exploit all features of XML, and entity definition is certainly a useful one. If you intend to use this technique and plan to write more than a few stylesheets, then consider placing common entity definitions in an external file and including them as shown in Example 1-6.

Example 1-6. standard.ent

<!ENTITY UPPERCASE "ABCDEFGHIJKLMNOPQRSTUVWXYZ">   
<!ENTITY LOWERCASE "abcdefghijklmnopqrstuvwxyz">
<!ENTITY UPPER_TO_LOWER " '&UPPERCASE;' , '&LOWERCASE;' ">
<!ENTITY LOWER_TO_UPPER " '&LOWERCASE;' , '&UPPERCASE;' ">
<!-- others... -->

Then use a parameter entity defined in terms of the external standard.ent file, as shown in Example 1-7.

Example 1-7. A stylesheet using standard.ent

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
     <!ENTITY % standard SYSTEM "standard.ent">
     %standard;
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- ... -->
</xsl:stylesheet>
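For reference, a complete stylesheet built on this pattern might look like the following sketch. It assumes that standard.ent from Example 1-6 sits in the same directory as the stylesheet and that the XML parser behind your XSLT processor is configured to load external entities; non-validating parsers are not required to do so, and some disable it by default for security reasons. The sample string is illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
     <!-- Pull in the shared declarations; the path is relative to this file -->
     <!ENTITY % standard SYSTEM "standard.ent">
     %standard;
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text" encoding="UTF-8"/>

     <xsl:template match="/">
          <!-- UPPER_TO_LOWER expands to the two quoted alphabet strings,
               separated by a comma -->
          <xsl:value-of
               select="translate('The Rain In Spain', &UPPER_TO_LOWER;)"/>
     </xsl:template>

</xsl:stylesheet>
```

If the parser skips the external file, the reference to UPPER_TO_LOWER is left unresolved and the stylesheet typically fails to load, so it is worth testing this setup with your processor before relying on it.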

Steve Ball’s implementation of case conversion works in virtually all cases by including all the most common Unicode characters in the upper- and lowercase strings and taking special care to handle the German ß (“eszett”) correctly.
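To give a feel for what the special handling of ß involves, here is a minimal XSLT 1.0 sketch. It is not taken from that library; the template name to-upper and the short alphabet strings are assumptions made for illustration. The template splits the text at each ß, emits SS in its place, and uppercases the remaining pieces with translate( ).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text" encoding="UTF-8"/>

     <!-- Hypothetical helper: uppercases $text, expanding each ß to SS -->
     <xsl:template name="to-upper">
          <xsl:param name="text"/>
          <xsl:choose>
               <xsl:when test="contains($text, 'ß')">
                    <!-- The part before the first ß contains no ß -->
                    <xsl:call-template name="to-upper">
                         <xsl:with-param name="text"
                              select="substring-before($text, 'ß')"/>
                    </xsl:call-template>
                    <xsl:text>SS</xsl:text>
                    <!-- Recurse on the rest, which may contain further ß -->
                    <xsl:call-template name="to-upper">
                         <xsl:with-param name="text"
                              select="substring-after($text, 'ß')"/>
                    </xsl:call-template>
               </xsl:when>
               <xsl:otherwise>
                    <!-- One-to-one mappings are left to translate() -->
                    <xsl:value-of select="translate($text,
                         'abcdefghijklmnopqrstuvwxyzäöü',
                         'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ')"/>
               </xsl:otherwise>
          </xsl:choose>
     </xsl:template>

     <xsl:template match="/">
          <xsl:call-template name="to-upper">
               <xsl:with-param name="text" select="'Straße'"/>
          </xsl:call-template>
     </xsl:template>

</xsl:stylesheet>
```

Run against any input document, this should emit STRASSE. The reverse direction has no such rule, since an uppercase SS cannot be mapped back to ß without knowing the word.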

See Also

Steve Ball’s solution is available in the “Standard XSLT Library” at http://xsltsl.sourceforge.net/.
