Programming Social Applications by Jonathan LeBlanc

Running Caja from a Web Application

We’ve seen how to take a mixed HTML and JavaScript document and cajole it into two files: the sanitized markup and the cajoled JavaScript of the original code. With that knowledge as our base, we’ll now explore how to cajole content from a web source.

The SVN source that we obtained for Caja includes a JavaScript sanitization file that lets us run a cajoling function against web content we provide. The file is located at src/com/google/caja/plugin/html-sanitizer.js within the caja directory.

The other file we will need is a whitelist of all of the available HTML tags, which the sanitizer will use to determine which tags should be left alone, which should be sanitized, and which should be removed completely. A sample file (html4-defs.js) with this type of structure is available at https://github.com/jcleblanc/programming-social-applications/tree/master/caja/web_sanitizer_simple/ and provides an aggressive parsing whitelist that we will use in our example.
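To make the three outcomes concrete (leave alone, sanitize, remove), here is a minimal sketch of how a whitelist lookup might work. The `TAG_POLICY` object and the `classifyTag` function are hypothetical names invented for this illustration; they do not reflect the actual structure of html4-defs.js or the API of html-sanitizer.js, and the sketch does no HTML parsing of its own.

```javascript
// Hypothetical, simplified whitelist for illustration only.
// This is NOT the actual structure of html4-defs.js.
const TAG_POLICY = {
  p: 'keep',         // left alone
  b: 'keep',
  a: 'sanitize',     // tag kept, but its attributes would be scrubbed
  img: 'sanitize',
  script: 'remove',  // dropped entirely
  iframe: 'remove'
};

// Look up the policy for a tag; anything not listed is treated as unsafe.
function classifyTag(tagName) {
  const key = String(tagName).toLowerCase();
  return Object.prototype.hasOwnProperty.call(TAG_POLICY, key)
    ? TAG_POLICY[key]
    : 'remove';
}

console.log(classifyTag('P'));       // keep
console.log(classifyTag('a'));       // sanitize
console.log(classifyTag('marquee')); // remove
```

Defaulting unknown tags to removal mirrors the idea of an aggressive whitelist: a tag has to be explicitly listed before any of it survives.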

With these two files in hand, we can begin building out the markup and JavaScript to create a simple parsing mechanism:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Simple Web Application Cajoler</title>
</head>
<body>
<script src="html4-defs.js"></script> ...