Using ABNF

My advice when writing protocol specs is to learn, and use, a formal grammar. It’s just less hassle than letting others interpret what you mean and then recovering from the inevitable false assumptions. The target of your grammar is other people: engineers, not compilers.

My favorite grammar is ABNF, as defined by RFC 2234 (the current version is RFC 5234), because it is probably the simplest and most widely used formal language for defining bidirectional communications protocols. Most IETF (Internet Engineering Task Force) specifications use ABNF, which is good company to be in.

I’ll give a 30-second crash course in writing ABNF here. It may remind you of regular expressions. You write the grammar as rules. Each rule takes the form “name = elements”. An element can be another rule (which you define further down), or a pre-defined “terminal” (like CRLF, OCTET), or a number. The RFC lists all the terminals. To define alternative elements, use “element / element”. To define repetition, put “*” in front of the element: “*element” means zero or more, and “1*element” means one or more (read the RFC, because it’s not intuitive). To group elements, use parentheses.
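To make the comparison with regular expressions concrete, here is a minimal sketch in Python. The `request`, `verb`, and `word` rules are hypothetical and invented for illustration; they do not come from NOM or from any RFC. The ABNF sits in the comments, with a rough `re` equivalent next to each construct.

```python
import re

# Hypothetical ABNF, for illustration only (not taken from any real spec):
#
#   request = verb 1*( SP word ) CRLF
#   verb    = "GET" / "PUT"
#   word    = 1*ALPHA
#
# SP, ALPHA, and CRLF are core rules predefined by the ABNF RFC.

ALPHA = r"[A-Za-z]"      # ALPHA: one ASCII letter
SP = r" "                # SP: a single space
CRLF = r"\r\n"           # CRLF: carriage return followed by line feed

WORD = rf"{ALPHA}+"      # 1*ALPHA         -> one or more letters
VERB = r"(?:GET|PUT)"    # "GET" / "PUT"   -> alternatives

# 1*( SP word ) -> a parenthesized group repeated one or more times.
# Quoted strings in ABNF are case-insensitive, hence re.IGNORECASE.
REQUEST = re.compile(rf"{VERB}(?:{SP}{WORD})+{CRLF}", re.IGNORECASE)


def is_request(text: str) -> bool:
    """Return True if the whole of text matches the hypothetical request rule."""
    return REQUEST.fullmatch(text) is not None
```

Under these assumptions, `is_request("GET foo bar\r\n")` is accepted, while `is_request("GET\r\n")` is rejected because `1*` demands at least one word.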

I’m not sure if this extension is proper, but I then prefix elements with “C:” and “S:” to indicate whether they come from the client or server.

Here’s a piece of ABNF for an unprotocol called NOM that we’ll come back to later in this chapter:

    nom-protocol    = open-peering *use-peering

    open-peering    = C:OHAI ( S:OHAI-OK / S:WTF )

    use-peering     = C:ICANHAZ
                    / S:CHEEZBURGER
                    / C:HUGZ S:HUGZ-OK
                    / S:HUGZ ...
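One way to see what this grammar says is to check a recorded message sequence against it by hand. The sketch below is an assumption-laden illustration, not part of NOM: the function names are made up, messages are represented as plain strings such as "C:OHAI", and only the alternatives that appear in full above are covered. The final alternative, which is cut off at "S:HUGZ ...", is deliberately left out rather than guessed.

```python
def match_open_peering(msgs, i):
    """open-peering = C:OHAI ( S:OHAI-OK / S:WTF )

    Return the index just past the match, or None if there is no match.
    """
    if msgs[i:i + 1] == ["C:OHAI"] and msgs[i + 1:i + 2] in (["S:OHAI-OK"], ["S:WTF"]):
        return i + 2
    return None


def match_use_peering(msgs, i):
    """use-peering = C:ICANHAZ / S:CHEEZBURGER / C:HUGZ S:HUGZ-OK / ...

    Only the alternatives shown in full in the text are handled here;
    the truncated last alternative is intentionally not implemented.
    """
    if msgs[i:i + 1] in (["C:ICANHAZ"], ["S:CHEEZBURGER"]):
        return i + 1
    if msgs[i:i + 2] == ["C:HUGZ", "S:HUGZ-OK"]:
        return i + 2
    return None


def is_valid_nom(messages):
    """nom-protocol = open-peering *use-peering"""
    msgs = list(messages)
    i = match_open_peering(msgs, 0)
    if i is None:
        return False
    # *use-peering: zero or more repetitions until the input is used up.
    while i < len(msgs):
        i = match_use_peering(msgs, i)
        if i is None:
            return False
    return True
```

For example, `is_valid_nom(["C:OHAI", "S:OHAI-OK", "C:ICANHAZ", "S:CHEEZBURGER"])` passes, while a sequence that starts with "C:ICANHAZ" fails because the grammar requires open-peering first. Note that this checks ordering only; it says nothing about what the messages contain.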
