Running MapReduce jobs from Oozie

We will write a simple MapReduce job for word count and schedule it through Oozie. Later, we will wrap this Workflow in our first Coordinator job. Along the way, we will pick up a few concepts and apply them in examples.

I have already saved a word count MapReduce program written in Java, which we will run over our input data. Let's dive into the code. You can find it in the mapreduce folder in Book_Code_Folder/learn_oozie/ch04/.
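The book's word count code is not reproduced here. As a rough illustration of what such a job computes, the following is a minimal, dependency-free sketch of the map, shuffle, and reduce steps in plain Java. It deliberately does not use the Hadoop API, and the class and method names (WordCountSketch, map, reduce, run) are hypothetical names chosen for this sketch, not names from the book's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of word count; NOT the Hadoop Mapper/Reducer API.
public class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Reduce step: sum all the counts collected for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by word (the "shuffle"),
    // then reduce each group. A TreeMap keeps the words sorted.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = List.of("oozie runs hadoop jobs", "hadoop runs mapreduce");
        // prints {hadoop=2, jobs=1, mapreduce=1, oozie=1, runs=2}
        System.out.println(run(input));
    }
}
```

In a real Hadoop job, the map and reduce steps would live in separate Mapper and Reducer classes and the framework would handle the grouping; the sketch only shows the data flow.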

Note

Check the workflow_0.5.xsd file in the xsd_svg folder and note the inputs needed for the MapReduce action to run.

The Workflow is shown in the following code. Notice that its arguments are the same as the ones we would pass to the hadoop jar command when running a MapReduce job. At the start of ...


Apache Oozie Essentials by Jagat Jasjit Singh
