Merging rows of two streams with the same or different structures

It's a common requirement to combine two or more streams into a single stream that contains the union of all their rows. Often the streams come from different sources and don't share the same structure, so combining them is not as simple as adding a step that joins the streams freely. Problems arise quickly if row formats and column orders differ between the streams. This recipe gives you tips to make the task easier.
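To make the column-order problem concrete, here is a minimal sketch in plain Python rather than in a transformation. It is not the recipe's method: the function name `merge_streams`, the row values, and the choice to fill missing fields with `None` are all assumptions made for illustration. The idea is only that each row is normalized to one agreed field list before the rows are put together.

```python
def merge_streams(stream_a, stream_b, fields):
    """Return the union of rows from both streams, each normalized to `fields`."""
    merged = []
    for row in list(stream_a) + list(stream_b):
        # Values are looked up by field name, so the column order in each
        # source does not matter; fields a source lacks become None.
        merged.append({name: row.get(name) for name in fields})
    return merged


# Hypothetical rows: the second source uses a different column order
# and has no "speed" field.
source_a = [{"roller_coaster": "Coaster A", "speed": "100 mph", "park": "Park A"}]
source_b = [{"park": "Park B", "roller_coaster": "Coaster B"}]

target_fields = ["roller_coaster", "speed", "park"]
merged_rows = merge_streams(source_a, source_b, target_fields)
for row in merged_rows:
    print(row)
```

Matching by name instead of by position is what keeps a value from one source from landing under the wrong column of the other.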

Suppose that you received data about roller coasters from two different sources. The data in one of those sources looks like the following:

roller_coaster|speed|park|location|country|Year
Top Thrill Dragster|120 mph|Cedar Point|Sandusky, ...

Get Pentaho Data Integration Cookbook Second Edition now with the O’Reilly learning platform.