Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Chapter 5. Iterative Computation with Spark

In the previous chapter, we saw how Samza can enable near real-time stream data processing within Hadoop. This is quite a step away from the traditional batch processing model of MapReduce, but it still follows the model of providing a well-defined interface against which business logic tasks can be implemented. In this chapter, we will explore Apache Spark, which can be viewed both as a framework on which applications can be built and as a processing framework in its own right. Not only are applications being built on Spark, but entire components within the Hadoop ecosystem are also being reimplemented to use Spark as their underlying processing framework. In particular, we will cover the following ...