Filed under Digital Publishing.

Author: Keith Fahlgren, Director of Engineering at Safari Books Online

Now that Threepress has joined Safari Books Online, we’ve been given access to (and responsibility for shepherding) a huge number of EPUBs. This has made me painfully aware of the current suite of tools and techniques available to both publishers and businesses working with EPUBs.

EpubCheck is the best tool for validating EPUB (and now EPUB 3) documents, especially now that the IDPF and DAISY Consortium are actively sponsoring its development. But many businesses have a range of unique preferences and business rules for EPUB documents that go beyond a strict validity test. To help address that, I’ve started work on a project called nort.

Rather than waiting for a polished version (that might never come), I’m releasing the first version of nort while it still does remarkably little: it runs extra layers of validation on the OPF file inside an EPUB and reports the results.
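For readers who haven't poked inside an EPUB before, here is a rough sketch of how a tool might find that OPF file: an EPUB is a zip archive, and `META-INF/container.xml` points at the package document through a `rootfile` element. This is a standard-library illustration only, not nort's own code; `read_opf` and `build_sample_epub` are hypothetical names made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

SAMPLE_CONTAINER = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    "<rootfiles>"
    '<rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/>'
    "</rootfiles>"
    "</container>"
)

SAMPLE_OPF = (
    '<package xmlns="http://www.idpf.org/2007/opf" version="2.0">'
    "<metadata/>"
    "</package>"
)


def read_opf(epub_file):
    """Return (path, bytes) for the OPF package document inside an EPUB."""
    with zipfile.ZipFile(epub_file) as archive:
        # container.xml names the package document in rootfile/@full-path.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml does not declare a rootfile")
        opf_path = rootfile.attrib["full-path"]
        return opf_path, archive.read(opf_path)


def build_sample_epub():
    """Build a tiny in-memory archive with just enough structure for a demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", SAMPLE_CONTAINER)
        archive.writestr("OEBPS/content.opf", SAMPLE_OPF)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    path, data = read_opf(build_sample_epub())
    print(path, len(data), "bytes")
```

Once the OPF bytes are in hand, they can be parsed and passed to whatever extra validation layer you like.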

The extra requirements are specified using ISO Schematron files, which offer a way to codify machine-readable “rules.” The advantage of Schematron is that it combines genuinely human-readable business rules, written in plain text, with a bit of XPath. Properly written, Schematron can give much more intelligible output than other validation techniques. Writing new rules requires someone comfortable with XML, but having that person sit down with someone who knows the business to translate sentences like “Every EPUB must include a cover image file” into XPath is usually straightforward (and rewarding). In fact, here’s that rule:
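As a minimal sketch (my assumption about the shape of such a rule, not necessarily the file that ships with nort), the cover requirement could be written as a Schematron `assert` on the OPF `meta name="cover"` element and exercised with lxml's `isoschematron` module. The names `COVER_RULE`, `OPF_TEMPLATE`, and `check_cover` are hypothetical, and the sample OPF is invented for the demo.

```python
from lxml import etree, isoschematron

# Illustrative Schematron: the plain-text business rule is the assert message,
# and the XPath in @test is what actually gets evaluated.
COVER_RULE = b"""\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="opf" uri="http://www.idpf.org/2007/opf"/>
  <pattern>
    <rule context="opf:metadata">
      <assert test="opf:meta[@name='cover']">Every EPUB must include a cover image file</assert>
    </rule>
  </pattern>
</schema>
"""

# Made-up OPF skeleton; {cover} is filled in (or left empty) per test case.
OPF_TEMPLATE = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample</dc:title>
    {cover}
  </metadata>
</package>
"""


def check_cover(opf_xml):
    """Run the cover rule against an OPF string; return (is_valid, messages)."""
    schematron = isoschematron.Schematron(etree.XML(COVER_RULE))
    is_valid = schematron.validate(etree.XML(opf_xml))
    # Each log entry carries the serialized SVRL for one failed assert.
    messages = [entry.message for entry in schematron.error_log]
    return is_valid, messages


if __name__ == "__main__":
    with_cover = OPF_TEMPLATE.format(
        cover='<meta name="cover" content="cover-image"/>'
    )
    without_cover = OPF_TEMPLATE.format(cover="")
    print(check_cover(with_cover)[0])
    print(check_cover(without_cover)[0])
```

The assert message is the sentence the business person wrote; only the `test` attribute needs XML expertise.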

The above rule asserts the presence of a particular element in the metadata. In some situations, it makes more sense just to report on what is there:
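A sketch of that reporting style, again assuming lxml's `isoschematron`: a Schematron `report` fires when its test is true, so it can simply list what it finds, here each `dc:language` value. The pattern, the sample OPF, and the names `LANGUAGE_REPORT` and `collect_reports` are illustrative and not taken from nort.

```python
from lxml import etree, isoschematron

# Namespace of the SVRL report that Schematron validation produces.
SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}

# Illustrative rule: report every non-empty dc:language in the OPF metadata.
LANGUAGE_REPORT = b"""\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="opf" uri="http://www.idpf.org/2007/opf"/>
  <ns prefix="dc" uri="http://purl.org/dc/elements/1.1/"/>
  <pattern>
    <rule context="opf:metadata/dc:language">
      <report test="normalize-space(.)">Declared language: <value-of select="."/></report>
    </rule>
  </pattern>
</schema>
"""

# Made-up OPF with two language declarations.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample</dc:title>
    <dc:language>en</dc:language>
    <dc:language>fr</dc:language>
  </metadata>
</package>
"""


def collect_reports(opf_xml):
    """Return the text of every successful report, whitespace-normalized."""
    schematron = isoschematron.Schematron(
        etree.XML(LANGUAGE_REPORT), store_report=True
    )
    schematron.validate(etree.XML(opf_xml))
    # With store_report=True the full SVRL document is kept for inspection.
    texts = schematron.validation_report.xpath(
        "//svrl:successful-report/svrl:text", namespaces=SVRL_NS
    )
    return [" ".join("".join(node.itertext()).split()) for node in texts]


if __name__ == "__main__":
    for line in collect_reports(SAMPLE_OPF):
        print(line)
```

Reports like this are handy for surveying a large collection before deciding which rules deserve to become hard asserts.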

nort is not a replacement for EpubCheck. Your EPUB files must be valid according to EpubCheck before using nort. nort is good for specifying or testing the particular preferences you have in addition to basic validity.


  • nort only supports XPath/XSLT 1.0 inside Schematron documents
  • As of version 0.1, nort only validates the OPF document
  • As of version 0.1, nort only validates EPUB, not EPUB 3.0 documents
  • nort requires lxml 2.3, which can be hard to install. Good luck!

If you have problems or think the tool would be worth extending in a particular way, please submit an issue or a pull request.

A complete version of the Schematron file from above

About the Author

Keith Fahlgren, Director of Engineering, Safari Books Online

Keith Fahlgren has deep experience in publishing technology, particularly in the area of digital content readability. Keith played a lead role in integrating Ibis Reader with existing platforms and helping publishers create digital content more effectively. His varied contributions to the digital publishing ecosystem include a number of open-source EPUB tools. Keith has spoken widely, was the co-founder of Ibis Reader, and was formerly at O’Reilly Media, where he helped design and implement many of their digital publishing workflows.

Tags: EPUB, Python, tools
