Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Summary

This chapter focused much more on what can be done on Hadoop 2, and on YARN in particular, than on the details of Hadoop internals. This is almost certainly a good thing, as it demonstrates that Hadoop is realizing its goal of becoming a much more flexible and generic data processing platform that is no longer tied to batch processing. Samza shows that processing frameworks implemented on YARN can innovate and enable functionality vastly different from that available in Hadoop 1.

In particular, we saw how Samza goes to the opposite end of the latency spectrum from batch processing and enables processing of individual messages as they arrive.

We also saw how Samza provides a callback ...
