Realizing a Lambda architecture

For this use case, we are focusing on a distributed computing pattern that integrates a real-time processing platform (that is, Storm) with an analytics engine (that is, Druid); we then pair it with an offline batch processing mechanism (that is, Hadoop) to ensure we have accurate historical metrics.

While that remains the focus, the other key goal we are trying to achieve is continuous availability and fault tolerance. More specifically, the system should tolerate the permanent loss of a node or even a data center. To achieve this kind of availability and fault tolerance, we need to focus a bit more on the persistence.

In a live system, we would use a distributed storage mechanism for persistence, ideally a storage ...

Get Storm Blueprints: Patterns for Distributed Real-time Computation now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.