Validating data at runtime

When processing data, there eventually comes a time when it is critical to validate the data in-stream, to ensure it is of high enough quality for the process to continue executing. Kettle comes with several built-in steps that provide validation capabilities, including a generic Data Validator step, which lets you check incoming data against a custom set of rules. In this recipe, we will build some custom rules to validate author data from the books database.
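The Data Validator step itself is configured through Spoon's GUI rather than code, but the underlying idea, applying a set of named rules to each row and routing failures to error handling, can be sketched in plain Python. The field names and rules below are illustrative assumptions, not taken from the recipe:

```python
# Hedged sketch: rule-based row validation, analogous in spirit to
# Kettle's Data Validator step. Field names ("lastname", "birthyear",
# "genre") and the rules themselves are illustrative assumptions.

def validate_row(row, rules):
    """Return the names of the rules the row violates (empty list = valid)."""
    return [name for name, check in rules.items() if not check(row)]

# Each rule maps a descriptive name to a predicate over the row,
# much like the validations you define per field in the step dialog.
rules = {
    "lastname_not_null": lambda r: bool(r.get("lastname")),
    "birthyear_numeric": lambda r: str(r.get("birthyear", "")).isdigit(),
    "genre_in_list": lambda r: r.get("genre") in {"Fiction", "Non-fiction"},
}

authors = [
    {"lastname": "Larsson", "birthyear": "1954", "genre": "Fiction"},
    {"lastname": "", "birthyear": "19xx", "genre": "Poetry"},
]

for author in authors:
    errors = validate_row(author, rules)
    # Valid rows continue down the stream; invalid rows would be routed
    # to an error-handling hop, as the Data Validator step allows.
    print(author.get("lastname") or "<missing>", "->", errors or "OK")
```

In the transformation itself, the equivalent routing is done by enabling error handling on the Data Validator step, so that failing rows flow to a separate branch instead of aborting the run.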

Getting ready

You must have a database that matches the books data structures, as listed in Appendix A, Data Structures. The code to build this database is available from Packt's website.

How to do it...

Perform the following steps:

  1. Create a new transformation. ...
