This chapter provides an in-depth look at the art of tuning Kettle's performance. We focus primarily on tuning transformations and look briefly at what can go wrong with the performance of a job.
For readers who are interested in the internals of the transformation engine, the first part of this chapter offers many details with a number of examples. After explaining how the transformation engine works, we turn to identifying performance bottlenecks. Then we offer advice on how to improve the performance of your transformations and jobs.
If you are new to Kettle, you may prefer to skip this chapter until you encounter a performance problem. At that point, you can turn to this chapter to learn how to identify and solve the problem you're encountering.
Performance tuning of a transformation is conceptually quite simple. As in any other network, you search for the weakest link. In the case of a transformation, that means searching for the step that is holding back the performance of the transformation as a whole. To better understand why this is important, take a look at a simple example. The transformation shown in Figure 15.1 reads customer data from one database and writes it into another. The bottom of the figure shows the step performance metrics during execution.
Figure 15.1. Reading ...
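The weakest-link idea can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: it assumes each step can be summarized by a single rows-per-second figure, and the function name `find_bottleneck`, the step names, and the speeds are all hypothetical. None of these values are taken from the figure.

```python
def find_bottleneck(step_speeds):
    """Return (step_name, rows_per_second) for the slowest step.

    step_speeds maps a step name to its observed throughput in rows
    per second. Both the names and the numbers are supplied by the
    caller; this sketch does not read them from Kettle.
    """
    if not step_speeds:
        raise ValueError("at least one step is required")
    # The slowest step is the one with the lowest throughput.
    return min(step_speeds.items(), key=lambda item: item[1])


# Hypothetical figures for illustration only (not measured data).
example_speeds = {
    "Table input": 12000.0,
    "Table output": 3500.0,
}

slowest_step, speed = find_bottleneck(example_speeds)
print(f"Weakest link: {slowest_step} at {speed:.0f} rows/s")
```

With these made-up numbers, the writing step is the weakest link, so tuning the reading step alone would not be expected to help much.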