Storm Real-time Processing Cookbook by Quinton Anderson

Chapter 6. Integrating Storm and Hadoop

In this chapter, we will cover:

  • Implementing TF-IDF in Hadoop
  • Persisting documents from Storm
  • Integrating the batch and real-time views

Introduction

In Chapter 4, Distributed Remote Procedure Calls, we implemented the Speed layer for a Lambda architecture instance using Storm. In this chapter, we will implement the Batch and Service layers to complete the architecture.

There are some key concepts underlying this big data architecture:

  • Immutable state
  • Abstraction and composition
  • Constrained complexity

Immutable state is the key concept because it provides true fault tolerance for the architecture: if a failure occurs at any level, we can always rebuild the derived data from the original immutable data. This is in contrast to ...
