Introducing PDI steps useful for cleansing data

Data cleansing, also known as data cleaning or data scrubbing, may be done manually or automatically, depending on the complexity of the cleansing. If you know in advance the rules that apply, you can automate the cleaning with whichever PDI steps suit the task.

The following steps are particularly useful, including the ones that we used in the previous examples:

| Step | Purpose |
| --- | --- |
| If field value is null | Replaces a null field value with a constant. It can be applied to all fields of the same data type (for example, all Integer fields) or to particular fields. |
| Null if... | Sets a field value to null if it is equal to a given constant value. |
| Number range | Creates ranges ... |
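
To make the behavior of the first two steps concrete, the following is a minimal sketch in plain Python. It is not PDI code and does not use any PDI API; it only imitates what the If field value is null and Null if... steps do to a row, as described above. The function names (replace_nulls, null_if), the sample field names, and the marker value "N/A" are all hypothetical and chosen for illustration.

```python
# Illustrative only: plain Python that imitates two PDI cleansing steps.
# None stands in for a null field value; each row is a dict of field -> value.

def null_if(row, markers):
    """Imitate 'Null if...': set a field to None when it equals a given constant.

    markers maps a field name to the constant that should become null.
    """
    return {
        field: (None if field in markers and value == markers[field] else value)
        for field, value in row.items()
    }


def replace_nulls(row, defaults):
    """Imitate 'If field value is null': replace None with a constant.

    defaults maps a field name to its replacement constant; fields that are
    not listed keep their null value.
    """
    return {
        field: (defaults.get(field) if value is None else value)
        for field, value in row.items()
    }


# Hypothetical sample row: "N/A" is a placeholder that really means "missing".
raw = {"name": "Ann", "age": None, "city": "N/A"}

# First turn the placeholder into a real null, then fill the nulls with defaults.
step1 = null_if(raw, {"city": "N/A"})
step2 = replace_nulls(step1, {"age": 0, "city": "unknown"})
```

Chaining the two functions mirrors a common cleansing pattern: normalize the different ways a source marks missing data into null first, and only then decide what constant, if any, should replace the nulls.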
