
Learning Hadoop 2 by Garry Turkington, Gabriele Modena


Chapter 4. Real-time Computation with Samza

The previous chapter discussed YARN and frequently mentioned the breadth of computational models and processing frameworks, beyond traditional batch-based MapReduce, that it enables on the Hadoop platform. In this chapter and the next, we will explore two such projects in some depth: Apache Samza and Apache Spark. We chose these frameworks because they demonstrate stream and iterative processing, respectively, and also provide interesting mechanisms for combining processing paradigms. In this chapter, we will explore Samza and cover the following topics:

  • What Samza is and how it integrates with YARN and other projects such as Apache Kafka
  • How Samza provides a simple callback-based interface for stream processing ...
