Comparing files and folders

Kettle allows you to compare files and folders through two job entries: File Compare and Compare folders. In this recipe, you will use the first of these entries, which compares the content of two files. Assume that you periodically receive a file with new museum data to incorporate into your database. You will compare the new version of the file with the previous one. If the two files are equal, you do nothing; if they differ, you read the new file.
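The decision logic described above can also be sketched outside Kettle. The following Python snippet is only an illustration of the compare-then-read idea, not the way the File Compare job entry is implemented; the function names files_are_equal and read_if_changed are hypothetical, and UTF-8 encoding of the files is an assumption.

```python
import filecmp
from pathlib import Path
from typing import Optional


def files_are_equal(previous: Path, new: Path) -> bool:
    """Return True when both files have identical content."""
    # shallow=False makes filecmp compare the actual bytes instead of
    # relying only on the os.stat() signature (type, size, mtime).
    return filecmp.cmp(previous, new, shallow=False)


def read_if_changed(previous: Path, new: Path) -> Optional[str]:
    """Read the new file only when it differs from the previous one.

    Hypothetical helper: returns None when there is nothing to do.
    """
    if files_are_equal(previous, new):
        return None
    # Assumption: the files are UTF-8 encoded text.
    return new.read_text(encoding="utf-8")
```

Calling read_if_changed with the paths of the previous and the new file returns None when nothing changed, and the new file's text otherwise. In the recipe itself, this branching is handled by the job entry rather than by custom code.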

Getting ready

To create and test this recipe, you will need two files: the older version of the museum file (LastMuseumsFileReceived.xml), and the new file (NewMuseumsFileReceived.xml).

On the book's website, you will find sample files to play ...
