
Building Real-Time Data Pipelines by Steven Camina, Kevin White, Conor Doherty, Gary Orenstein

Chapter 8. Data Persistence and Availability

Fundamental to any operational database is its ability to store information durably and be resilient to unexpected machine failures. In more technical terms, an operational database must:

  • Persist all its information to disk storage for durability.
  • Ensure data is highly available by maintaining a readily available second copy of all data and by failing over automatically, without downtime, when a server crashes.

The previous chapters have touted the ability of in-memory, distributed, SQL-based (relational) databases to provide the fastest performance for a wide range of use cases, but the data persistence question always arises:

If the database is “in-memory,” what guarantees are there that the data will be fully persistent and always available?

This section looks closely at in-memory, distributed, relational SQL database systems and how they can be architected to guarantee data durability and high availability. Figure 8-1 presents a high-level architecture that illustrates how an in-memory database could provide these guarantees.

Figure 8-1. In-memory database persistence and high availability

Data Durability

For data storage to be durable, it must survive in the event of a server failure. After the server failure, the data should be recoverable into a transactionally consistent state without any data loss or ...
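As a rough illustration of what "recoverable into a transactionally consistent state" can mean in practice, here is a minimal Python sketch of one common approach: an append-only transaction log that is flushed to disk before a commit is acknowledged, then replayed on restart. The excerpt above does not name a specific mechanism, so the technique, the class name `DurableKV`, and the JSON-lines log format are all assumptions made for illustration; a production database would be considerably more involved.

```python
import json
import os


class DurableKV:
    """Toy in-memory key-value store backed by an append-only log (hypothetical)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}      # the in-memory copy that serves reads
        self._replay()      # rebuild state from disk after a restart
        self._log = open(log_path, "ab")

    def _replay(self):
        """Apply every complete transaction record; discard a torn tail."""
        if not os.path.exists(self.log_path):
            return
        good_bytes = 0
        with open(self.log_path, "rb") as f:
            for raw in f:
                try:
                    if not raw.endswith(b"\n"):
                        raise ValueError("incomplete record")
                    record = json.loads(raw.decode("utf-8"))
                except ValueError:
                    break   # partial write left by a crash: stop here
                self.data.update(record["writes"])
                good_bytes += len(raw)
        # Cut off any partial record so later appends start on a clean line.
        with open(self.log_path, "r+b") as f:
            f.truncate(good_bytes)

    def commit(self, writes):
        """Log the whole transaction as one record, force it to disk, then apply it."""
        line = json.dumps({"writes": writes}) + "\n"
        self._log.write(line.encode("utf-8"))
        self._log.flush()
        os.fsync(self._log.fileno())   # ask the OS to push the record to storage
        self.data.update(writes)       # only now does the change become visible

    def close(self):
        self._log.close()
```

Because each transaction is written as a single record and is applied to memory only after `os.fsync` returns, a restart sees either all of a transaction's writes or none of them. Constructing `DurableKV` again with the same path after a crash replays the log and restores the last committed state.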
