Instant Pentaho Data Integration Kitchen

Book Description

Explore the Pentaho Data Integration command-line tools and learn how to use Kitchen effectively

  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results

  • Understand how to discover the repository structure using the command-line scripts

  • Learn to configure the log properly and how to gather the information that helps you investigate any kind of problem

  • Explore all the possible ways to start jobs and transformations without any difficulty

In Detail

Pentaho Data Integration (PDI) is a modern, powerful, and easy-to-use ETL system that lets you develop ETL processes with simplicity. With extensive descriptions and a good set of samples, this book helps you gain the experience and skills you need to run processes from the command line or schedule them.

Instant Pentaho Data Integration Kitchen How-to will help you understand the correct way to deal with PDI command-line tools. We start with a recap of how transformations and jobs are designed using Spoon, then configure the memory requirements needed to run your processes properly from the command line, and then move forward with a set of recipes that show you the different ways to start PDI processes.

We dive into the various flags that control the logging system, covering how to specify the logging output and the log verbosity. The focus throughout is on delivering the knowledge you require to run ETL processes using the command-line tools with ease and proficiency.

Table of Contents

  1. Instant Pentaho Data Integration Kitchen
    1. Instant Pentaho Data Integration Kitchen
    2. Credits
    3. About the Author
    4. About the Reviewer
    5. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    6. Preface
      1. How the story began…
      2. Kettle components
      3. What this book covers
      4. What you need for this book
      5. Who this book is for
      6. Conventions
      7. Reader feedback
      8. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    7. Chapter 1. Instant Pentaho Data Integration Kitchen
      1. Designing a simple PDI transformation (Simple)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. How to quickly find the steps to use
      2. Designing a simple PDI job (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
          1. Why a proper naming for tasks and steps is so important
          2. Using internal variables to write location-independent processes
      3. The important role of icon and color indicators
      4. Configuring command-line tools to run properly (Simple)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. Making things easier by writing custom scripts
      5. Executing PDI jobs from a filesystem (Simple)
        1. Getting ready
        2. How to do it...
      6. Executing PDI jobs packaged in archive files (Intermediate)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
          1. Changes in job and transformation design
      7. Executing PDI jobs from the repository (Simple)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. Changes in job and transformation design
          2. How to define a filesystem repository
          3. Defining a database repository
      8. Dealing with the execution log (Simple)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. Understanding the log to identify where our process fails
          2. Separating execution logfiles by date and time
      9. Discovering your PDI repository from the command line (Simple)
        1. Getting ready
        2. How to do it...
      10. Exporting jobs and transformations to the .zip files (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      11. Managing PDI processes return code (Simple)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. A summary of Kitchen/Pan exit codes
      12. Scheduling PDI jobs and transformations (Intermediate)
        1. Getting ready
        2. How to do it...
        3. There's more...
          1. Understanding crontab malfunctions