Chapter 12. Data Integration

Microservices are optimized for teams that need to evolve independent parts of the system easily. Each team works on its own deliverable, or feature, independently of the rest of the organization: a separate codebase, a separate release cycle, and possibly separate technologies! A side effect of this isolation is that services are distributed. Service boundaries are explicit, and data access happens through the service boundary. This implies process distribution and network partitions. In Chapter 9, we looked at how to model bounded contexts and interact with popular data sources like MongoDB or Redis. It’s trivial to stand up individual services that manage their own data source; the question is: how do these nodes communicate? How do they agree upon state?

In this chapter, we’ll look at a few different ways, old and new, to take data from different microservices and integrate it. One of the key concerns we’ll try to address is the integrity of the data in the face of distribution. The distributed systems literature is vast and comprehensive. There are seminal papers, such as “Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System,” which breaks down the problem of consistency in a distributed database architecture. There is also Eric Brewer’s “CAP Theorem,” which states that any distributed system can have at most two of three desirable properties:

  • Consistency (C), which is equivalent to having a single, up-to-date copy of the data ...
