Reading spreadsheets

Spreadsheets are another very common kind of file used in Extract, Transform, and Load (ETL) processes. The PDI step for reading spreadsheets is Microsoft Excel Input. It accepts both Excel 97-2003 (XLS) and Excel 2007 (XLSX) files. Despite its name, the step can also read OpenOffice (ODS) files.

The main difference between this step and the steps that read plain files is that the Microsoft Excel Input step lets you specify the name of the sheet to read. For each sheet, you provide its name as well as the row and column to start reading at.

Keep in mind that row and column numbers in a sheet start at 0, so cell A1 is row 0, column 0.
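
Because spreadsheet applications label cells with letters and 1-based row numbers, it is easy to be off by one when filling in the start row and column. The following Python sketch is not part of PDI; it only illustrates the arithmetic. The function name cell_to_zero_based is a hypothetical helper introduced here for illustration.

```python
import re
from typing import Tuple


def cell_to_zero_based(ref: str) -> Tuple[int, int]:
    """Convert an A1-style reference such as 'B3' to 0-based (row, column).

    Hypothetical helper for illustration only; PDI does not provide it.
    """
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref.strip())
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, digits = match.groups()

    # Column letters behave like a base-26 number without a zero digit:
    # A=1 ... Z=26, AA=27, and so on.
    column = 0
    for ch in letters.upper():
        column = column * 26 + (ord(ch) - ord("A") + 1)

    # Shift both values from 1-based to 0-based.
    return int(digits) - 1, column - 1


if __name__ == "__main__":
    for ref in ("A1", "B3", "AA10"):
        row, col = cell_to_zero_based(ref)
        print(f"{ref}: start row {row}, start column {col}")
```

For example, if the data begins at cell B3 in the spreadsheet application, the values to enter would be start row 2 and start column 1.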

You can read more than one sheet at a time, as long as all share the same ...
