Pig Coordinator job v2

We will improve our Coordinator using the concept of Datasets. The code for this section is available at BOOK_CODE_HOME/learn_oozie/ch05/rainfall/v2.

The goal of this section is simple: to learn how the Coordinator's dataset parameterization functions determine which dataset instance is used for processing. We will see these functions shortly.

The Coordinator for our problem statement is shown in the upcoming screenshot. It uses the Dataset through the declaration on line 5 of the screenshot. The Dataset itself is defined in the datasets.xml file, as shown in the following code:

<datasets>
    <dataset name="rainfall" frequency="${coord:months(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="Australia/Sydney">
        ...
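To make the link between the Dataset and the Coordinator concrete, here is a minimal sketch of how a Coordinator might pull in datasets.xml and pick a dataset instance. It is not the book's v2 listing: the Coordinator name, the start and end times, the data-in name input, the property name inputPath, and the variables datasetsPath and wfPath are all hypothetical and chosen only for illustration. Only the dataset name rainfall and the monthly frequency come from the definition above.

```xml
<!-- Illustrative sketch only; names and dates marked "hypothetical" are assumptions. -->
<coordinator-app name="rainfall-coord-sketch"
                 frequency="${coord:months(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z"
                 timezone="Australia/Sydney"
                 xmlns="uri:oozie:coordinator:0.4">
    <datasets>
        <!-- Reuse the shared dataset definitions; datasetsPath is a hypothetical job property -->
        <include>${datasetsPath}/datasets.xml</include>
    </datasets>
    <input-events>
        <!-- "input" is a hypothetical name for this input event -->
        <data-in name="input" dataset="rainfall">
            <!-- coord:current(0) selects the dataset instance for the current nominal time -->
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <!-- wfPath is a hypothetical job property pointing at the workflow application -->
            <app-path>${wfPath}</app-path>
            <configuration>
                <property>
                    <!-- inputPath is a hypothetical workflow parameter -->
                    <name>inputPath</name>
                    <!-- coord:dataIn resolves the input event to its dataset URI(s) -->
                    <value>${coord:dataIn('input')}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
```

In this sketch, coord:current(0) is the parameterization function that chooses the instance, and coord:dataIn passes the resolved location on to the workflow. Changing the argument, for example to coord:current(-1), would select the previous monthly instance instead.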
