Chapter 4. Fault tolerance

In this chapter

  • Building self-healing systems
  • Understanding the let-it-crash principle
  • Understanding the actor lifecycle
  • Supervising actors
  • Choosing fault recovery strategies

This chapter covers Akka’s tools for making applications more resilient. The first section describes the let-it-crash principle, including supervision, monitoring, and actor lifecycle features. We’ll then work through examples that show how to apply these features to typical failure scenarios.

4.1. What fault tolerance is (and what it isn’t)

Let’s start with a definition of what we mean when we say a system is fault tolerant, and why you’d write code to embrace the notion of failure. In an ideal world, a system is always available and can ...
