Aggregating data with a Row Denormaliser step

In the previous section, you learned how to use the Row Denormaliser step to combine several rows into one. This step can also generate a new dataset with aggregated or consolidated data. If you take a look at the file with films, you will notice that the first film has two directors. However, when we denormalized the data, PDI picked only the first one to fill in the Directed by field. By aggregating fields, we can easily fix this. Let's change the Transformation that you created earlier in two ways: we will fill in the Directed by field with the full list of directors, and we will create a new field with the number of directors for each film:

  1. Open ...
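
To make the goal concrete, the aggregation described above can be sketched outside PDI. The following Python snippet is only an illustration of the idea, not what the Row Denormaliser step does internally: the sample rows, the film and director names, the `denormalise` helper, and the `Number of directors` field name are all hypothetical and are not taken from the films file.

```python
from collections import defaultdict

# Hypothetical normalized rows: (film, key, value).
# These are made-up values, not the contents of the book's films file.
rows = [
    ("Film A", "Directed by", "Director X"),
    ("Film A", "Directed by", "Director Y"),
    ("Film A", "Year", "1999"),
    ("Film B", "Directed by", "Director Z"),
    ("Film B", "Year", "2005"),
]


def denormalise(rows, key_name="Directed by", separator=", "):
    """Collapse the rows of each film into one record.

    All values found under key_name are concatenated into a single
    field, and a second field holds how many values were found.
    """
    values_per_film = defaultdict(list)
    for film, key, value in rows:
        if key == key_name:
            values_per_film[film].append(value)

    return {
        film: {
            key_name: separator.join(values),
            "Number of directors": len(values),  # hypothetical field name
        }
        for film, values in values_per_film.items()
    }


result = denormalise(rows)
for film, fields in result.items():
    print(film, fields)
```

In this sketch, the film with two director rows ends up with both names in a single Directed by value and a count of 2, instead of keeping only the first name.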
