CHAPTER 17

Building a YARN Application

Hadoop 2.0 allows developers to plug other frameworks into the platform. Many frameworks have been developed around data stored in HDFS, and some have been covered in this book (HAMA and Spark, for example). Some of these frameworks were developed to overcome the limitations of using MapReduce for all types of problems. For example, a key limitation of MapReduce is that each MapReduce job reads its input from, and writes its output to, HDFS, so iterative algorithms run several times slower in MapReduce. Each iteration is a separate MapReduce job that reads the output of the previous iteration's job from HDFS. ...
