Developing YARN applications

YARN opens Hadoop to computing paradigms other than MapReduce. In Hadoop 2.X, MapReduce, Pig, and Hive all run as Application Master libraries with corresponding clients. Developers can write their own applications against the YARN API and reuse the infrastructure that already runs Hadoop. Enterprises often hold large data assets in HDFS already, and custom YARN applications can work on that data in place, with no need to provision new clusters or migrate existing data.

Storm is a real-time stream-processing engine that has been ported to YARN, bringing in the paradigm of moving data to compute nodes. Spark is another project that runs on YARN and can leverage the existing Hadoop infrastructure to provide in-memory data ...
