Filed under Digital Publishing, html5, xslt.

The World Wide Web Consortium (W3C) is a standards organization serving the “open web”: the set of freely available specifications that underpin most of the visible internet. In the years since the W3C was founded, nearly every modern business has become a “web” business, each with its own industry-specific processes, jargon, and priorities. In response, the W3C has formed interest groups for industries adjacent to the web, with the goal of promoting web technologies and ensuring that the web meets common commercial needs.

I was co-chair for the Digital Publishing Interest Group for a time, and I have first-hand exposure to their work in interviewing publishers, documenting best practices, and writing recommendations for future specifications.

Screen shot of the first table of the DPUB specification review

One of those deliverables is an intimidating table of W3C specifications and standards considered relevant to digital publishing. There’s a lot to digest there, and it’s unlikely that any single human is deeply familiar with all of it. What follows is my opinionated gloss of the most relevant or active standards; feel free to comment if I’ve disparaged or ignored your favorite specification.

The audience

I’m assuming that the reader is one of the following:

  • A developer who is working in digital publishing
  • A curious non-developer who isn’t afraid of the word “normative” and acronyms that begin with ‘X’
  • A standards wonk who wants to be more familiar with publishing activity

These are the “bread and butter” of digital publishing — whether it’s commercial ebooks, academic publishing, or journals:


HTML5 is a monster of a spec, but at least it’s reflective of current browser support. You should be familiar with the basics of markup, as well as the sections on browsers and common APIs.


There’s the workhorse CSS 2.1 specification which has been around for a decade. Unfortunately for the curious but lazy, all the cool new stuff is in CSS3, and that spec is broken out into many modules. Here’s a drive-by of the most interesting or publishing-relevant ones:

  • Start with Dave Cramer’s highly readable Requirements for Latin Text Layout and Pagination (“Latin” here means Western languages, not veni, vidi, vici). Note that this is a requirements document, not a spec, which means much of what Dave recommends won’t actually work anywhere yet. Welcome to standards!
  • CSS Text Module Level 3 is the “real world” equivalent to the above. Though it’s technically a spec in progress, almost everything in here is available in modern browsers and reading systems.
  • CSS Regions Module Level 1 is a good read when you want to be angry about something. Regions can do some amazing things for advanced layout, but there’s a long and sordid history behind their implementation and deployment. There’s a lot of momentum behind getting Regions or an equivalent standard moving again, so there’s hope.

Extra credit assignments: CSS Media Queries and CSS Fonts Module Level 3. And while it’s unlikely that you’d need to actually read the SVG and MathML specs, it’s important to be familiar with those formats at a high level.


The simplest way to approach accessible web or ebook content is to study the semantics that are built in to HTML5. High-quality semantic markup will not only help a range of human users, it’ll aid in discovery and ranking by search engines.

Follow that up with the non-technical best practices in Web Content Accessibility Guidelines, and this overview of creating accessible interactive content.
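
To make the point about built-in semantics a little more concrete, here is a rough sketch in Python that counts how often a document uses elements with built-in meaning versus generic `div` and `span` containers. The tag lists, the `SemanticAudit` class, the `audit` function, and the sample markup are all my own illustrative choices, not anything taken from the HTML5 spec or from WCAG, and a count like this says nothing about whether the elements are used correctly.

```python
from collections import Counter
from html.parser import HTMLParser

# An illustrative subset of HTML5 elements that carry built-in meaning.
# This list is an assumption for the example, not an exhaustive one.
SEMANTIC_TAGS = {
    "article", "aside", "figure", "figcaption",
    "footer", "header", "main", "nav", "section",
}
# Generic containers that carry no meaning of their own.
GENERIC_TAGS = {"div", "span"}


class SemanticAudit(HTMLParser):
    """Count semantic and generic start tags in a piece of markup."""

    def __init__(self):
        super().__init__()
        self.semantic = Counter()
        self.generic = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names already lowercased.
        if tag in SEMANTIC_TAGS:
            self.semantic[tag] += 1
        elif tag in GENERIC_TAGS:
            self.generic[tag] += 1


def audit(markup):
    """Return (semantic_counts, generic_counts) for the given markup."""
    parser = SemanticAudit()
    parser.feed(markup)
    parser.close()
    return parser.semantic, parser.generic


# Hypothetical sample chapter, invented for this example.
SAMPLE = """
<main>
  <article>
    <header><h1>Chapter 1</h1></header>
    <section><p>Call me Ishmael.</p></section>
    <aside><div class="note">A sidebar.</div></aside>
  </article>
</main>
"""

semantic, generic = audit(SAMPLE)
print("semantic:", dict(semantic))
print("generic:", dict(generic))
```

A file that reports dozens of `div` elements and no `section`, `nav`, or `aside` is usually a sign that the meaning lives only in class names, which is exactly what assistive technology and search engines can’t see.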


XML isn’t dead yet! There’s a lot of cruft in the list, but ebooks are still required to be well-formed XML documents, and academic publishing remains dominated by XML (and, sigh, PDF).
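
If “well-formed” sounds abstract, here is a minimal sketch of what the requirement means in practice, using the XML parser in Python’s standard library. The `is_well_formed` helper and both sample strings are hypothetical, made up for this example. Note that this only checks well-formedness (every tag closed and properly nested); it does not validate against any schema, which is a separate and stricter test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Hypothetical snippets. The only difference is <br/> versus <br>.
GOOD = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<body><p>One<br/>Two</p></body></html>')
BAD = ('<html xmlns="http://www.w3.org/1999/xhtml">'
       '<body><p>One<br>Two</p></body></html>')

for label, text in (("good", GOOD), ("bad", BAD)):
    ok, message = is_well_formed(text)
    print(label, ok, message)
```

The unclosed `<br>` is perfectly acceptable to a browser parsing HTML, but an XML parser rejects the whole document, which is why markup that looks fine on the web can still fail in an ebook toolchain.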

Bleeding edge

If everything above is old hat, check out the emerging specs on the Shadow DOM, CSS Flexible Box Layout Module Level 1 (flexbox), and Packaging on the Web.

Tags: CSS3, digital publishing, dpub, ebooks, EPUB, html5, specifications, standards, w3c, xmlwtfbbq

4 Responses to “An opinionated guide to digital publishing specifications”

  1. Bill Kasdorf

    This is so fantastic, Liza–thanks!

    One question: you don’t mention Schematron. (Maybe in conjunction with XSLT?) That sure is proving to be indispensable, imo.


    P.S. You’ll note that I didn’t mention metadata. There’s a reason for that. ;-)

    • Jean Kaplansky

      Bill – Schematron is actually an ISO spec, which may be why Liza didn’t include it in the W3C pool. [ISO Schematron of 2006 (ISO/IEC FDIS 19757-3)]

      Liza – thanks for a great intro! You’re right on target about these specs being intimidating to the uninitiated. This article is a great place for people to start! I may start using this as a laundry list when I need to teach people how to swim in the W3C “pool.”

      • Bill Kasdorf

        Good point. Another thing people don’t realize (and I forgot in this instance, though I point this out to people all the time!) is that not all of the Open Web Platform is controlled by W3C. Big chunks–Schematron, as you point out, plus JavaScript, etc.–are not W3C specs. Which is how we get to consider EPUB 3 (IDPF) as part of the OWP, which I do!