Chapter 4. Running MapReduce Jobs

In this chapter, we will learn how to run MapReduce jobs using Oozie. MapReduce jobs are of two types: Java MapReduce jobs and Streaming jobs. Streaming jobs let us write the mapper and reducer in languages other than Java. We will also move on to the "when" part of Workflow execution, using Coordinators to schedule our jobs.

In this chapter, we will do the following:

  • Run Java MapReduce jobs from Oozie
  • Run Streaming jobs from Oozie
  • Run Coordinator jobs

From a conceptual point of view, we will:

  • Understand the concept of Coordinators
  • Understand the concept of cron-based frequency schedules
  • Understand the importance of timezone in Oozie
  • Understand the concept of Datasets

Chapter case study

The customer for whom we work also keeps track of what ...
