Reading files with some fields occupying two or more rows

When you use one of the Kettle steps devoted to reading files, Kettle expects one entity per row. For example, if you are reading a file with a list of customers, Kettle expects one customer per row. Suppose instead that you have a file organized by rows, with the fields in different columns, but where some of the fields span several rows, as in the following example containing data about roller coasters:

    Roller Coaster       Speed      Location                   Year
    Kingda Ka            128 mph    Six Flags Great Adventure
                                    Jackson, New Jersey        2005
    Top Thrill Dragster  120 mph    Cedar Point
                                    Sandusky, Ohio             2003
    Dodonpa              106.8 mph  Fuji-Q Highland
                                    FujiYoshida-shi            2001
                                    Japan
    Steel Dragon 2000    95 mph     Nagashima Spa Land
                                    Mie                        2000
                                    Japan
    Millennium ...
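
To make the problem concrete outside of Kettle, the following Python fragment is a minimal sketch of the underlying idea: a row with a value in the first column starts a new roller coaster, and any row with an empty first column is folded into the previous one. It is not the Kettle transformation itself. The semicolon-delimited layout of `SAMPLE`, the function name `merge_multirow_records`, and the choice to join the location lines with a comma are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical semicolon-delimited rendering of the first rows of the sample.
# The real file may use a different layout (for example, fixed-width columns).
SAMPLE = """\
Roller Coaster;Speed;Location;Year
Kingda Ka;128 mph;Six Flags Great Adventure;
;;Jackson, New Jersey;2005
Top Thrill Dragster;120 mph;Cedar Point;
;;Sandusky, Ohio;2003
Dodonpa;106.8 mph;Fuji-Q Highland;
;;FujiYoshida-shi;2001
;;Japan;
"""


def merge_multirow_records(text):
    """Collapse continuation rows into one dictionary per roller coaster."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = next(reader)
    records = []
    for row in reader:
        # Trim each cell and pad short rows so every row has one cell per column.
        cells = [cell.strip() for cell in row]
        cells += [""] * (len(header) - len(cells))
        if not any(cells):
            continue  # skip completely blank rows
        if cells[0]:
            # A non-empty first column marks the start of a new entity.
            records.append(dict(zip(header, cells)))
        elif records:
            # Continuation row: fill or extend the fields of the last entity.
            current = records[-1]
            for name, value in zip(header, cells):
                if value:
                    current[name] = (
                        f"{current[name]}, {value}" if current[name] else value
                    )
    return records


if __name__ == "__main__":
    for record in merge_multirow_records(SAMPLE):
        print(record)
```

With this assumed input, each roller coaster ends up as a single record whose `Location` field combines all of its location lines and whose `Year` field is taken from whichever row carried it.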
