Discovering metadata and injecting it

Let's move on to a slightly more elaborate use case. We will continue working with sales data, this time from an Excel file named sales_data.xls, which has a single sheet. The file contains several fields, but we are only interested in three: PRODUCTLINE, PRODUCTCODE, and QUANTITYORDERED. The problem is that the fields can appear in any order in the file, and we will only know that order when we read it.
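To make the problem concrete, here is a minimal Python sketch, written outside of PDI, of what discovering the field order at read time amounts to. It assumes the sheet has already been loaded into a list of rows whose first row holds the headers. The helper name `discover_positions` and the sample rows are hypothetical and only serve as illustration.

```python
# Fields we care about, as named in the text.
WANTED = ["PRODUCTLINE", "PRODUCTCODE", "QUANTITYORDERED"]


def discover_positions(header_row, wanted=WANTED):
    """Return {field name: column index} for each wanted field."""
    positions = {
        name: index for index, name in enumerate(header_row) if name in wanted
    }
    missing = [name for name in wanted if name not in positions]
    if missing:
        raise ValueError(f"Fields not found in header row: {missing}")
    return positions


# Hypothetical sample: the column order is only known once the header is read.
rows = [
    ["ORDERNUMBER", "QUANTITYORDERED", "PRODUCTCODE", "PRODUCTLINE"],
    [10107, 30, "S10_1678", "Motorcycles"],
]

positions = discover_positions(rows[0])

# Pick the three fields of interest from the first data row, whatever their order.
record = {name: rows[1][positions[name]] for name in WANTED}
```

The lookup is driven by the header row rather than by fixed column positions, which is the same reason the Transformation in this section cannot hardcode its field list.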

As before, we need to create a template with missing data, and then a Transformation that injects that data.

Let's start with the template. As we don't have the list of fields, we will fill the Fields grid with generic names—HEADER1, HEADER2 ...
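The idea of a template whose generic names are later replaced can be pictured with a small Python sketch. This is only an analogy, not PDI's metadata injection mechanism: the dictionary layout, the step name `read_sales_file`, the number of placeholders, and the `inject_field_names` helper are all assumptions made for illustration.

```python
import copy

# Hypothetical template: the field names are placeholders because the real
# names are unknown until the file is read.
template = {
    "step": "read_sales_file",
    "fields": ["HEADER1", "HEADER2", "HEADER3"],
}


def inject_field_names(template, discovered_names):
    """Return a copy of the template with placeholders replaced, in order."""
    if len(discovered_names) > len(template["fields"]):
        raise ValueError("More discovered fields than placeholders in template")
    injected = copy.deepcopy(template)
    for index, name in enumerate(discovered_names):
        injected["fields"][index] = name
    return injected


# Names as they might be discovered at runtime (the order here is made up).
discovered = ["QUANTITYORDERED", "PRODUCTCODE", "PRODUCTLINE"]
injected = inject_field_names(template, discovered)
```

The template itself is never modified. Each run produces a filled-in copy, so the same template can be reused for files whose fields arrive in a different order.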
