Heartbeating

Heartbeating solves the problem of knowing whether a peer is alive or dead. This problem is not specific to ØMQ. TCP can take a long time (30 minutes or so) to time out a dead connection, which means that it can be impossible to know whether a peer has died, been disconnected, or gone on a weekend trip to Prague with a case of vodka, a redhead, and a large expense account.

It’s not easy to get heartbeating right. When writing the Paranoid Pirate examples, it took me about five hours to get the heartbeating working properly. The rest of the request-reply chain took perhaps 10 minutes. It is especially easy to create “false failures”; that is, cases where peers decide that they are disconnected only because the heartbeats aren’t sent properly.
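To make the “false failure” risk concrete, here is a minimal sketch of the bookkeeping one side might do to judge whether a peer is alive. It is not the Paranoid Pirate code and uses no ØMQ calls. The class name `PeerMonitor` and the parameters `interval`, `liveness`, and `clock` are assumptions made up for illustration. The idea is to declare a peer dead only after several heartbeat intervals pass in silence, so that one late or lost heartbeat doesn’t count as a failure.

```python
import time


class PeerMonitor:
    """Hypothetical helper: tracks whether one peer still looks alive."""

    def __init__(self, interval=1.0, liveness=3, clock=time.monotonic):
        self.interval = interval    # assumed seconds between expected heartbeats
        self.liveness = liveness    # assumed number of silent intervals we tolerate
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = clock()

    def on_message(self):
        # Call this for anything received from the peer, heartbeat or not.
        self.last_seen = self.clock()

    def missed(self):
        # Number of whole intervals that have passed with no traffic.
        return int((self.clock() - self.last_seen) // self.interval)

    def is_alive(self):
        return self.missed() < self.liveness
```

With `liveness=1`, a single heartbeat delayed by a busy peer would be enough to trigger a false failure. A higher value trades slower detection for fewer false alarms.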

In this section, we’ll look at the three main solutions people use for heartbeating with ØMQ.
