Running MapReduce jobs from Oozie
We will see how to write a simple MapReduce job for word count and schedule it via Oozie. Later, we will wrap this in our first Coordinator job. Along this journey, we will learn some concepts and apply them in examples.
I have already saved a word count MapReduce program written in Java, which we will run over our input data. Let's dive into the code. You can find it in the mapreduce folder in Book_Code_Folder/learn_oozie/ch04/.
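Before looking at the job itself, it may help to see what a word count computes. The following is a minimal sketch in plain Java of the map, shuffle, and reduce steps. It is not the book's code and it does not use the Hadoop API; the class name WordCountSketch and its method names are hypothetical, chosen only for illustration. A real job would express the same logic through Hadoop's Mapper and Reducer classes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: no Hadoop dependency, not the book's code.
class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Reduce step: sum all the counts emitted for a single word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: map every line, group the values by word (the shuffle), then reduce each group.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {hadoop=1, hello=2, oozie=1}
        System.out.println(countWords(Arrays.asList("hello oozie", "hello hadoop")));
    }
}
```

A TreeMap is used so the output is sorted by word, which loosely mirrors how reducer input arrives sorted by key in a real MapReduce run.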
Note
Check the workflow_0.5.xsd file in the xsd_svg folder and note the inputs needed for the MapReduce action to run.
The Workflow is shown in the following code, and we can see that its arguments are the same as the ones we need in the hadoop jar command for running a MapReduce job. At the start of ...