Making a performance difference when looking up data in a database

Database lookups are costly and can severely degrade Transformation performance. However, you can improve performance significantly by using the cache feature of the Database lookup step. To enable it, check the Enable cache? option.

This is how it works. Think of the cache as a buffer of high-speed memory that temporarily holds frequently requested data. With the cache option enabled, PDI looks in the cache first and queries the database only when the value is not found there:

  • If the lookup table has few records, you can preload the cache with all of its data. Do this by checking the Load all data from table option. This gives you the best performance. ...
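
To make the cache-first behavior concrete, here is a minimal Python sketch of the idea. It is not PDI's implementation: the CachedLookup class, the cities table, and the db_queries counter are all hypothetical names made up for illustration, and an in-memory SQLite database stands in for the real lookup table. The sketch also assumes that when the whole table is preloaded, a cache miss means there is no matching row.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory lookup table (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO cities VALUES (?, ?)",
        [(1, "Lima"), (2, "Quito"), (3, "Bogota")],
    )
    return conn


class CachedLookup:
    """Cache-first lookup against one table (illustrative sketch only)."""

    def __init__(self, conn, load_all=False):
        self.conn = conn
        self.cache = {}
        self.db_queries = 0  # counts round trips to the database
        self.load_all = load_all
        if load_all:
            # Preload: a single query copies the whole table into memory.
            self.db_queries += 1
            for key, value in conn.execute("SELECT id, name FROM cities"):
                self.cache[key] = value

    def lookup(self, key):
        # 1. Look in the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. Assumption: with everything preloaded, a miss means "no match".
        if self.load_all:
            return None
        # 3. Otherwise query the database and remember the answer.
        self.db_queries += 1
        row = self.conn.execute(
            "SELECT name FROM cities WHERE id = ?", (key,)
        ).fetchone()
        value = row[0] if row else None
        self.cache[key] = value
        return value


if __name__ == "__main__":
    conn = make_demo_db()
    incoming_keys = [1, 2, 1, 1, 3, 2]  # keys arriving in the stream

    on_demand = CachedLookup(conn)
    print([on_demand.lookup(k) for k in incoming_keys], on_demand.db_queries)

    preloaded = CachedLookup(conn, load_all=True)
    print([preloaded.lookup(k) for k in incoming_keys], preloaded.db_queries)
```

In this sketch, the on-demand cache queries the database once per distinct key, while the preloaded cache issues a single query up front. That is why preloading pays off when the lookup table is small enough to fit comfortably in memory.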
