Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Summary

In this chapter, we introduced four tools to ease development on Hadoop. In particular, we covered:

  • How Hadoop streaming lets us write MapReduce jobs in dynamic languages
  • How Kite Data simplifies interfacing with heterogeneous data sources
  • How Apache Crunch provides a high-level abstraction for writing pipelines of Spark and MapReduce jobs that implement common design patterns
  • How Morphlines lets us declare chains of commands and data transformations that can then be embedded in any Java codebase
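
As a reminder of the first point, here is a minimal word-count sketch in the Hadoop streaming style, where the mapper and reducer are plain scripts that read lines from standard input and write tab-separated key/value records to standard output. The function names and the convention of choosing the role with a command-line argument are assumptions made for illustration; they are not taken from the chapter.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Sum the counts for each key.

    Assumes the input is already sorted by key, which is how
    streaming hands mapper output to the reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: the role ("map" or "reduce") is passed as the
# first argument; without it the script does nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a name of our choosing, the same file could be supplied to the streaming jar through its `-mapper` and `-reducer` options, once with the `map` argument and once with `reduce`. Locally, piping the mapper output through `sort` before the reducer imitates the shuffle phase.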

In Chapter 10, Running a Hadoop 2 Cluster, we will shift our focus from the domain of software development to system administration. We will discuss how to set up, manage, and scale a Hadoop cluster, while taking aspects such as monitoring ...
