JavaScript Cookbook by Shelley Powers

19.7. Glean Page RDFa and Convert It into JSON Using rdfQuery and the jQuery RDF Plug-in

Problem

You’re using Drupal 7, a Content Management System (CMS) that annotates the page metadata with RDFa—Resource Description Framework (RDF) embedded into X/HTML. Here’s an example of the type of data in the page (from the RDFa specification):

<h1>Biblio description</h1>
<dl about="http://www.w3.org/TR/2004/REC-rdf-mt-20040210/"
id="biblio">
  <dt>Title</dt>
   <dd property="dc:title">
RDF Semantics - W3C Recommendation 10 February 2004</dd>
  <dt>Author</dt>
   <dd rel="dc:creator" href="#a1">
    <span id="a1">
      <link rel="rdf:type" href="[foaf:Person]" />
      <span property="foaf:name">Patrick Hayes</span>
      see <a rel="foaf:homepage"
href="http://www.ihmc.us/users/user.php?UserID=42">homepage</a>
    </span>
   </dd>
</dl>

You want to convert that RDFa formatted data into a JavaScript object, and eventually into JSON for an Ajax call.

Figure 19-2. Microformat hCalendar application in IE

Solution

Use one of the RDFa JavaScript libraries, such as rdfQuery, which has the added advantage of being built on jQuery (the default JavaScript library used with Drupal). The rdfQuery library also implements an RDFa gleaner, which takes a jQuery object, gleans all of the RDFa from it and its subtree, and automatically converts the data into RDF triples stored in an in-memory database:

var triplestore =  $('#biblio').rdf()
  .base('http://burningbird.net')
  .prefix('rdf','http://www.w3.org/1999/02/22-rdf-syntax-ns#')
  .prefix('dc','http://purl.org/dc/elements/1.1/')
  .prefix('foaf','http://xmlns.com/foaf/0.1/');

Once you have the data, you can export a JavaScript object of the triples:

var data = triplestore.databank.dump();

And then you can convert that into JSON:

var jsonStr = JSON.stringify(data);

Discussion

RDF is a way of recording metadata so that data from one site can be safely combined with data from many others, and queried for specific information or used in rules-based derivations. The data is stored as triples, each of which is nothing more than a simple subject-predicate-object set, usually displayed as:

<http://www.example.org/jo/blog> foaf:primaryTopic <#bbq> .
<http://www.example.org/jo/blog> dc:creator "Jo" .

These triples say that the subject, a blog identified by a specific URL, has a primary topic of “bbq,” or barbecue, and that its creator is named Jo.
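To make the shorthand in those triples concrete, here is a minimal sketch in plain JavaScript, without rdfQuery. It models each triple as a simple record and expands a prefixed name such as dc:creator into a full URI. The record layout and the expandPrefixedName function are my own illustration, not part of any library; the namespace URIs are the ones used in the solution.

```javascript
// Prefix map using the namespace URIs from the solution.
var prefixes = {
  dc: "http://purl.org/dc/elements/1.1/",
  foaf: "http://xmlns.com/foaf/0.1/"
};

// Hypothetical helper: expand a prefixed name such as "dc:creator"
// into a full URI. The input is returned unchanged when it has no
// colon or when its prefix is not in the map.
function expandPrefixedName(name, map) {
  var colon = name.indexOf(":");
  if (colon === -1) {
    return name;
  }
  var prefix = name.slice(0, colon);
  if (!Object.prototype.hasOwnProperty.call(map, prefix)) {
    return name;
  }
  return map[prefix] + name.slice(colon + 1);
}

// The two example triples as subject/predicate/object records.
// This layout is illustrative only; it is not how rdfQuery stores triples.
var triples = [
  { subject: "http://www.example.org/jo/blog",
    predicate: "foaf:primaryTopic",
    object: "#bbq" },
  { subject: "http://www.example.org/jo/blog",
    predicate: "dc:creator",
    object: "Jo" }
];

// Copy each record, replacing the prefixed predicate with its full URI.
var expanded = triples.map(function (t) {
  return {
    subject: t.subject,
    predicate: expandPrefixedName(t.predicate, prefixes),
    object: t.object
  };
});
```

This is the same kind of mapping the .prefix() calls set up in the solution: a short name on the page stands in for a full predicate URI in the triple.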

This is a book on JavaScript, so I don’t want to spend more time on RDFa or RDF. I’ll provide links later where you can get more information on both. For now, just be aware that we’re going to take that RDFa annotation in the page, convert it into a triple store using rdfQuery, and then export it as a JavaScript object, and eventually JSON.

The RDFa, embedded into X/HTML, has the opposite challenges from Microformats: the syntax is very regular and well-defined, but accessing the data can be quite challenging. That’s the primary reason to use a library such as rdfQuery.

In the solution, the code uses jQuery selector notation to access the element with the id “biblio”, and then uses the .rdf() gleaner to extract all of the RDFa from that element and its subtree and store it in an in-memory data store.

The solution then maps the prefixes for the RDFa: dc is mapped to http://purl.org/dc/elements/1.1/, and so on. Once these two actions are finished, a dump of the store creates a JavaScript object containing the triple objects extracted from the RDFa, which are then converted into JSON using the JSON.stringify method. The resulting string with the five derived triples looks like this (line breaks and indentation added for readability):

{
  "http://www.w3.org/TR/2004/REC-rdf-mt-20040210/": {
    "http://purl.org/dc/elements/1.1/title": [
      {"type": "literal",
       "value": "RDF Semantics - W3C Recommendation 10 February 2004"}
    ],
    "http://purl.org/dc/elements/1.1/creator": [
      {"type": "uri",
       "value": "http://burningbird.net/jscb/data/rdfa.xhtml#a1"}
    ]
  },
  "http://burningbird.net/jscb/data/rdfa.xhtml#a1": {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Patrick Hayes"}
    ],
    "http://xmlns.com/foaf/0.1/homepage": [
      {"type": "uri",
       "value": "http://www.ihmc.us/users/user.php?UserID=42"}
    ]
  }
}

This converts into Turtle notation as:

<http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>
  <http://purl.org/dc/elements/1.1/title>
  "RDF Semantics - W3C Recommendation 10 February 2004" .
<http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>
  <http://purl.org/dc/elements/1.1/creator>
  <http://burningbird.net/jscb/data/rdfa.xhtml#a1> .
<http://burningbird.net/jscb/data/rdfa.xhtml#a1>
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  <http://xmlns.com/foaf/0.1/Person> .
<http://burningbird.net/jscb/data/rdfa.xhtml#a1>
  <http://xmlns.com/foaf/0.1/name>
  "Patrick Hayes" .
<http://burningbird.net/jscb/data/rdfa.xhtml#a1>
  <http://xmlns.com/foaf/0.1/homepage>
  <http://www.ihmc.us/users/user.php?UserID=42> .
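The relationship between the dumped object and the triple listing can be sketched in a few lines of plain JavaScript. The function below is my own illustration, not an rdfQuery method. It assumes the dump has the shape shown earlier (subject, then predicate, then an array of type/value objects) and handles only the "uri" and "literal" types that appear in this example. It also uses JSON.stringify to quote literals, which only approximates Turtle's escaping rules.

```javascript
// Hypothetical helper: convert an object shaped like the dump
// (subject -> predicate -> array of { type, value }) into triple lines.
function dumpToTripleLines(dump) {
  var lines = [];
  Object.keys(dump).forEach(function (subject) {
    var predicates = dump[subject];
    Object.keys(predicates).forEach(function (predicate) {
      predicates[predicate].forEach(function (object) {
        // URIs go in angle brackets; anything else is treated as a
        // literal and quoted (JSON quoting, an approximation of Turtle).
        var term = object.type === "uri"
          ? "<" + object.value + ">"
          : JSON.stringify(String(object.value));
        lines.push("<" + subject + "> <" + predicate + "> " + term + " .");
      });
    });
  });
  return lines;
}

// A cut-down sample with the same shape as the dump in this recipe.
var sample = {
  "http://burningbird.net/jscb/data/rdfa.xhtml#a1": {
    "http://xmlns.com/foaf/0.1/name": [
      { type: "literal", value: "Patrick Hayes" }
    ],
    "http://xmlns.com/foaf/0.1/homepage": [
      { type: "uri", value: "http://www.ihmc.us/users/user.php?UserID=42" }
    ]
  }
};

var lines = dumpToTripleLines(sample);
```

Each entry in lines is one subject-predicate-object statement, which is the same information the nested object carries, just flattened.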

Once you have the string, you can use it in an Ajax call to a web service that makes use of RDF or JSON, or both.
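As a rough sketch of that Ajax step, the following code separates building the request from sending it with XMLHttpRequest. Everything here is an assumption made for illustration: the function names, the endpoint URL (a placeholder, not a real service), and the choice of POST with an application/json content type. What a given web service actually expects will differ.

```javascript
// Hypothetical helper: describe a POST request carrying the JSON string.
function buildTripleRequest(url, jsonStr) {
  return {
    url: url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: jsonStr
  };
}

// Hypothetical helper: send the request from a browser and hand the
// HTTP status and response text to a callback once the call completes.
function sendTripleRequest(request, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.open(request.method, request.url, true);
  Object.keys(request.headers).forEach(function (name) {
    xhr.setRequestHeader(name, request.headers[name]);
  });
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.send(request.body);
}

// Build a request for a placeholder endpoint with a tiny stand-in payload.
var request = buildTripleRequest(
  "http://example.org/triples", // placeholder endpoint, not a real service
  JSON.stringify({ "http://example.org/s": {} })
);
```

In a page, you would pass the string from JSON.stringify as the second argument and then call sendTripleRequest(request, callback) with your own callback.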

Example 19-4 combines the pieces of the solution into a full-page application to demonstrate how the components work together. The application prints out the JSON.stringify data dump of the data and then prints out each triple individually, converting the angle brackets of the triples first so that appending them to the page won’t trigger an XHTML parsing error.

Example 19-4. Extracting RDFa from a page and embedding the data into the page

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/" >
  <head profile="http://ns.inria.fr/grddl/rdfa/">
    <title>Biblio description</title>
<style type="text/css">
div { margin: 20px; }
</style>
  <script type="text/javascript" src="json2.js"></script>
  <script type="text/javascript" src="jquery.js"></script>
  <script type="text/javascript"
src="jquery.rdfquery.rdfa.min-1.0.js"></script>
  <script type="text/javascript">
  //<![CDATA[

    window.onload = function() {

      var j =  $('#biblio').rdf()
        .base('http://burningbird.net')
        .prefix('rdf','http://www.w3.org/1999/02/22-rdf-syntax-ns#')
        .prefix('dc','http://purl.org/dc/elements/1.1/')
        .prefix('foaf','http://xmlns.com/foaf/0.1/');

       var d = j.databank.dump();
       var str = JSON.stringify(d);
       document.getElementById("result1").innerHTML = str;

       var t = j.databank.triples();
       var str2 = "";
       for (var i = 0; i < t.length; i++) {
         str2 = str2 +
           t[i].toString().replace(/</g,"&lt;").replace(/>/g,"&gt;") +
           "<br />";
       }
       document.getElementById("result2").innerHTML = str2;
    }
  //]]>
  </script>
  </head>
  <body>
    <h1>Biblio description</h1>
    <dl about="http://www.w3.org/TR/2004/REC-rdf-mt-20040210/"
id="biblio">
      <dt>Title</dt>
       <dd property="dc:title">
RDF Semantics - W3C Recommendation 10 February 2004</dd>
      <dt>Author</dt>
       <dd rel="dc:creator" href="#a1">
        <span id="a1">
          <link rel="rdf:type" href="[foaf:Person]" />
          <span property="foaf:name">Patrick Hayes</span>
          see <a rel="foaf:homepage"
href="http://www.ihmc.us/users/user.php?UserID=42">homepage</a>
        </span>
       </dd>
    </dl>
    <div id="result1"></div>
    <div id="result2"></div>
  </body>
</html>

Figure 19-3 shows the page after the JavaScript has finished. The application uses the json2.js library for browsers that haven’t implemented the JSON object yet.
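The angle bracket conversion in Example 19-4 can be pulled out into a small helper. This is a sketch under my own naming (escapeMarkup is not part of rdfQuery or jQuery). Unlike the example, it also escapes ampersands, an addition I'm assuming is useful because URIs with query strings can contain them.

```javascript
// Hypothetical helper: escape text so it can be assigned to innerHTML
// without being parsed as markup. The ampersand is replaced first so
// the entities added by the later replacements are not escaped again.
function escapeMarkup(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// A made-up triple line with a query string in the subject URI.
var escaped = escapeMarkup(
  '<http://example.org/a?x=1&y=2> <http://example.org/p> "v" .'
);
```

Inside the loop in Example 19-4, the chained replace calls could then become a single call such as escapeMarkup(t[i].toString()).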

You can also do a host of other things with rdfQuery, such as add triples directly, query across the triples, make inferences, and anything else you would like to do with RDF.

See Also

rdfQuery was created by Jeni Tennison. You can download it and read more documentation on its use at http://code.google.com/p/rdfquery/. When I used the library for writing this section, I used it with jQuery 1.4.2. Another RDFa library is the RDFa Parsing Module for the Backplane library.

For more information on RDF, see the RDF Primer. The RDFa Primer can be found at http://www.w3.org/TR/xhtml-rdfa-primer/. There is a new effort to create an RDFa-in-HTML specification, specifically for HTML5.

Figure 19-3. Running the RDFa extraction application in Opera
