XSLT, 2nd Edition by Doug Tidwell

[2.0] The unparsed-text() and unparsed-text-available() Functions

The last new function for combining documents is the unparsed-text() function, which reads the text at a URL without parsing it. That lets you read in plain text documents, comma-separated values, or even HTML documents that aren’t well-formed XML. What’s more, you can combine unparsed-text() with other new features such as the tokenize() function or the <xsl:analyze-string> element to process that text and transform it into something useful.
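The companion function named in this section’s title, unparsed-text-available(), returns true when a call to unparsed-text() with the same arguments would succeed, so it can guard the read. The following is a minimal sketch rather than a definitive pattern; the parameter name href and the file name notes.txt are hypothetical, and a relative URL is resolved against the stylesheet’s base URI.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the URL of the text file to read -->
  <xsl:param name="href" select="'notes.txt'"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- True only if unparsed-text($href) would succeed -->
      <xsl:when test="unparsed-text-available($href)">
        <!-- Copy the file's text to the output as-is -->
        <xsl:value-of select="unparsed-text($href)"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Report the problem instead of failing the transformation -->
        <xsl:text>Could not read </xsl:text>
        <xsl:value-of select="$href"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the template matches the document root, the stylesheet needs some XML source document to run against, although it never looks at that document’s content.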

As an example, we’ll read in a file of comma-separated values and output them as an HTML table of addresses. Here’s the comma-separated file, unparsed-text.csv:

Mr.,Chester Hasbrouck,Frisby,1234 Main Street,Sheboygan,WI,48392
Ms.,Natalie,Attired,707 Breitling Way,Winter Harbor,ME,00218
Ms.,Amanda,Reckonwith,930-A Chestnut Street,Lynn,MA,02930
Mrs.,Mary,Backstayge,283 First Avenue,Skunk Haven,MA,02718

We’ll go through three simple steps to process this data. First, we’ll use the tokenize() function to get each line of the file. Next, we’ll use tokenize() to get each comma-separated value. Finally, we’ll take each value and transform it appropriately. Using the comma-separated file we’ve listed here, the third comma-separated value in each line is the customer’s last name, the seventh value is the zip code, and so forth.
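To make the second and third steps concrete, here is a small sketch that splits one line of the sample data and picks values out by position. The variable names line and fields are illustrative only. Splitting on a bare comma assumes that no value contains a comma of its own, which holds for the sample file.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One line copied from the sample file -->
    <xsl:variable name="line"
        select="'Ms.,Natalie,Attired,707 Breitling Way,Winter Harbor,ME,00218'"/>

    <!-- tokenize() returns a sequence of seven strings -->
    <xsl:variable name="fields" select="tokenize($line, ',')"/>

    <!-- The third value is the last name, the seventh is the zip code -->
    <xsl:text>Last name: </xsl:text>
    <xsl:value-of select="$fields[3]"/>
    <xsl:text>&#10;Zip code: </xsl:text>
    <xsl:value-of select="$fields[7]"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against any XML source document, this writes "Last name: Attired" on one line and "Zip code: 00218" on the next.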

To process the file one line at a time, we’ll use this technique, courtesy of the XSLT 2.0 spec (the fragment reads a file named addresses.csv; substitute the name of your own file):

<xsl:for-each select="tokenize(unparsed-text('addresses.csv'), '\r?\n')"> ...
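The fragment above is elided, so the following is only one way the three steps might fit together, not necessarily the stylesheet the author goes on to present. The parameter name csv, the column headings, and the way values are grouped into cells are all assumptions made for illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- File name assumed from the fragment above; change it to match your file -->
  <xsl:param name="csv" select="'addresses.csv'"/>

  <xsl:template match="/">
    <html>
      <head><title>Addresses</title></head>
      <body>
        <table border="1">
          <tr><th>Name</th><th>Street</th><th>City, State, Zip</th></tr>
          <!-- Step 1: split the file into lines; the predicate drops
               empty lines, such as one after a trailing newline -->
          <xsl:for-each
              select="tokenize(unparsed-text($csv), '\r?\n')[normalize-space()]">
            <!-- Step 2: split the current line into its values -->
            <xsl:variable name="fields" select="tokenize(., ',')"/>
            <!-- Step 3: place each value by its position in the line -->
            <tr>
              <!-- Title, first name, last name, joined by spaces -->
              <td><xsl:value-of select="$fields[position() le 3]" separator=" "/></td>
              <td><xsl:value-of select="$fields[4]"/></td>
              <td>
                <xsl:value-of
                    select="concat($fields[5], ', ', $fields[6], ' ', $fields[7])"/>
              </td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Inside the xsl:for-each, the context item is a string rather than a node, which is why tokenize(., ',') can be applied to it directly. As with any use of unparsed-text(), a relative file name is resolved against the stylesheet’s base URI.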
