Formatting the results of MapReduce computations – using Hadoop OutputFormats

Often the output of your MapReduce computation will be consumed by other applications. Hence, it is important to store the result of a MapReduce computation in a format that the target application can consume efficiently, and to organize the data in a location that the target application can access efficiently. We can use the Hadoop OutputFormat interface to define the data storage format, the data storage location, and the organization of the output data of a MapReduce computation. An OutputFormat prepares the output location and provides a RecordWriter implementation that performs the actual serialization and storage of the data.
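The division of labor described above can be illustrated with a minimal, plain-Java sketch. It does not use Hadoop's actual classes: `SketchOutputFormat`, `SketchRecordWriter`, `TabSeparatedOutputFormat`, and the `prepareOutputLocation` method are hypothetical stand-ins, and the tab-separated layout and `part-r-NNNNN` file naming are assumptions chosen to resemble common text output. In Hadoop itself the corresponding types are `org.apache.hadoop.mapreduce.OutputFormat` and `org.apache.hadoop.mapreduce.RecordWriter`, whose signatures also involve job and task contexts.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a record writer: serializes one key/value pair at a time.
interface SketchRecordWriter<K, V> extends AutoCloseable {
    void write(K key, V value) throws IOException;

    @Override
    void close() throws IOException;
}

// Hypothetical stand-in for an output format: prepares the output location
// and hands out a writer for each task.
interface SketchOutputFormat<K, V> {
    void prepareOutputLocation(Path outputDir) throws IOException;

    SketchRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId) throws IOException;
}

// Assumed layout: one "key<TAB>value" line per record, one part file per task.
final class TabSeparatedOutputFormat<K, V> implements SketchOutputFormat<K, V> {
    @Override
    public void prepareOutputLocation(Path outputDir) throws IOException {
        // Refuse to overwrite existing results, then create the directory.
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory already exists: " + outputDir);
        }
        Files.createDirectories(outputDir);
    }

    @Override
    public SketchRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId) throws IOException {
        Path partFile = outputDir.resolve(String.format("part-r-%05d", taskId));
        BufferedWriter out = Files.newBufferedWriter(partFile, StandardCharsets.UTF_8);
        return new SketchRecordWriter<K, V>() {
            @Override
            public void write(K key, V value) throws IOException {
                // Serialization step: the format decides how a record looks on disk.
                out.write(key + "\t" + value + "\n");
            }

            @Override
            public void close() throws IOException {
                out.close();
            }
        };
    }
}

class OutputFormatDemo {
    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("of-sketch").resolve("out");
        TabSeparatedOutputFormat<String, Integer> format = new TabSeparatedOutputFormat<>();
        format.prepareOutputLocation(outputDir);
        try (SketchRecordWriter<String, Integer> writer = format.getRecordWriter(outputDir, 0)) {
            writer.write("hadoop", 3);
            writer.write("mapreduce", 5);
        }
        System.out.println("Wrote " + outputDir.resolve("part-r-00000"));
    }
}
```

Swapping in a different implementation of the format interface changes how and where records are stored without touching the code that produces them, which is the same separation that lets a MapReduce job change its output format independently of its reducer logic.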

Hadoop uses ...
