Chapter 3: Service Configuration and Coordination

The early development of high-performance computing was primarily focused on scale-up architectures. These systems had multiple processors sharing a memory bus. In fact, modern multicore server hardware is architecturally quite similar to the supercomputer hardware of the 1980s (although modern hardware usually forgoes the Cray's bench seating).

As system interlinks and network hardware improved, these systems became more distributed, eventually evolving into the Network of Workstations (NOW) model that underlies most present-day computing environments. However, network interlinks usually have neither the bandwidth required to simulate the shared-resource environment of a scale-up architecture nor the reliability of the interlinks used in one. Distributed systems also introduce new failure modes that are not generally found in a scale-up environment.

This chapter discusses systems and techniques used to overcome these problems in a distributed environment, in particular the management of shared state in distributed applications. Configuration management and coordination are first discussed in general. The chapter then turns to ZooKeeper, the most popular system for this purpose. ZooKeeper was developed by Yahoo! to help manage its Hadoop infrastructure and is discussed extensively because it is used by most of the other distributed software systems in this book.

Motivation ...
