Getting Started with Talend Open Studio for Data Integration

Book Description

This is the complete course for anybody who wants to get to grips with Talend Open Studio for Data Integration. From the basics of transferring data to complex integration processes, it will give you a head start.

  • Develop complex integration jobs without writing code

  • Go beyond “extract, transform and load” by constructing end-to-end integrations

  • Learn how to package your jobs for production use

In Detail

    Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, and move, copy and rename files. Individual components can then be connected to define complex integration processes.

    "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions.

    TOS is a code generator and so does a lot of the “heavy lifting” for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks – transforming files and extracting data from a database, for example. These building blocks form a “toolkit” of techniques that you will learn how to apply in many different situations.

    By the end of the book, once-complex integrations will appear easy and you will be your organization’s integration expert!

    Best of all, TOS makes integrating systems fun!

    Table of Contents

    1. Getting Started with Talend Open Studio for Data Integration
      1. Getting Started with Talend Open Studio for Data Integration
      2. Credits
      3. Foreword
      4. Foreword
      5. About the Author
      6. Acknowledgement
      7. About the Reviewers
        1. Support files, eBooks, discount offers, and more
          1. Why Subscribe?
          2. Free Access for Packt account holders
      9. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      10. 1. Knowing Talend Open Studio
        1. What Talend Open Studio is
          1. Use cases
          2. History of Talend Open Studio
          3. Benefits of Talend Open Studio
        2. Installing Talend Open Studio
          1. Prerequisites
          2. Installation guide
        3. Other useful software
          1. Text editor
          2. MySQL
        4. Sample jobs and data
        5. Summary
      11. 2. Working with Talend Open Studio
        1. Studio definitions
        2. Starting the Studio
        3. Tour of the Studio
          1. The Repository
          2. The design workspace
          3. The Palette
          4. Configuration tabs
          5. Outline and Code panels
        4. Creating a new project
        5. Creating an example job
        6. Metadata
        7. Summary
      12. 3. Transforming Files
        1. Transforming XML to CSV
        2. Transforming CSV to XML
        3. Maps and expressions
        4. Advanced XML output for complex XML structures
        5. Working with multi-schema XML files
        6. Enriching data with lookups
        7. Extracting data from Excel files
          1. Extracting data from multiple sheets
          2. Joining data from multiple sheets
        8. Summary
      13. 4. Working with Databases
        1. Database metadata
        2. Extracting data from a database
        3. Extracts from multiple tables
          1. Joining within the database component
          2. Joining outside the database component
        4. Writing data to a database
        5. Database to database transfer
        6. Modifying data in a database
        7. Dynamic database lookup
        8. Summary
      14. 5. Filtering, Sorting, and Other Processing Techniques
        1. Filtering data
          1. Simple filter
          2. Filter and rejects
          3. Filter and split
        2. Sorting data
        3. Aggregating data
        4. Normalizing and denormalizing data
          1. Data normalization
          2. Data denormalization
        5. Extracting delimited fields
        6. Find and replace
        7. Sampling rows
        8. Summary
      15. 6. Managing Files
        1. Managing local files
          1. Copying files
          2. Copying and removing files
          3. Renaming files
          4. Deleting files
          5. Timestamping a file
          6. Listing files in a directory
          7. Checking for files
          8. Archiving and unarchiving files
        2. FTP file operations
          1. FTP Metadata
          2. FTP Put
          3. FTP Get
          4. FTP File Exist
          5. FTP File List and Rename
          6. Deleting files on an FTP server
        3. Summary
      16. 7. Job Orchestration
        1. What is a subjob
        2. A simple subjob
        3. On Subjob Error
        4. On Component OK
        5. Run If
        6. Jobs as subjobs
        7. Iterating and looping
          1. Iterate connections
          2. ForEach loop
          3. Loop "n" times
          4. Infinite loop
        8. Duplicating and merging dataflows
          1. Duplicating data
          2. Merging data
        9. Summary
      17. 8. Managing Jobs
        1. Job versions
        2. Exporting and importing jobs
          1. Exporting jobs
            1. Exporting a project
            2. Exporting a job
            3. Exporting a job for execution
          2. Importing jobs
            1. Importing a project
            2. Importing a job
        3. Scheduling jobs
        4. Summary
      18. 9. Global Variables and Contexts
        1. Global variables
          1. Studio global variables
          2. User defined global variables
        2. Contexts
          1. Embedded context variables
          2. Repository context variables
          3. External context variables
          4. Complex context variables
          5. Using embedded, repository, and external contexts
        3. Summary
      19. 10. Worked Examples
        1. Product catalog
          1. Data import from the ERP system
          2. Data import from Fabric Fashions
          3. Data import from Runway Collections
        2. Product inventory data
        3. Order file processing
        4. Order status updates
        5. Automating processes
          1. E-mailing daily sales
          2. Automating product visibility
        6. Summary
      20. A. Installing Sample Jobs and Data
        1. Downloading job and data files
          1. Sample data files
          2. Sample database
          3. Sample jobs
      21. B. Resources
        1. Talend documentation
        2. TalendForge forum
        3. Webinars
        4. Tutorials
        5. Talend Exchange