To take advantage of the available processing power, distributing rows is a good option. It gives you better performance, which is critical if the processing is heavy or the dataset is huge.
A step further in the distribution of rows is the concept of partitioning. Partitioning also splits the dataset into several smaller datasets, but the rows are distributed according to a rule that is applied to each row.
The standard partitioning method offered by PDI is Remainder of division. You choose a partitioning field, and PDI divides its value by the number of predefined partitions. The remainder of that division determines the partition to which the row is sent.
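The arithmetic behind this rule can be sketched in a few lines of Python. This is only an illustration of the remainder calculation, not PDI's actual implementation: the function names, the `id` field, and the sample rows are hypothetical, and the sketch assumes the partitioning field holds an integer.

```python
from typing import Dict, List


def partition_for(value: int, num_partitions: int) -> int:
    """Return the partition number as the remainder of the division."""
    return value % num_partitions


def split_rows(rows: List[Dict], field: str, num_partitions: int) -> List[List[Dict]]:
    """Group rows into partitions using the remainder of `field` / `num_partitions`."""
    partitions: List[List[Dict]] = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_for(row[field], num_partitions)].append(row)
    return partitions


# Hypothetical sample: seven rows with ids 1..7, split into three partitions.
rows = [{"id": i} for i in range(1, 8)]
result = split_rows(rows, "id", 3)
ids_per_partition = [[row["id"] for row in part] for part in result]
print(ids_per_partition)  # [[3, 6], [1, 4, 7], [2, 5]]
```

Unlike a plain round-robin distribution, rows with the same value in the partitioning field always end up in the same partition.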
As an example, in our sample Transformation, we can create a partitioning schema with three partitions and choose ...