Instant Apache Sqoop

Book Description

Transfer data efficiently between RDBMS and the Hadoop ecosystem using the robust Apache Sqoop.

  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results

  • Learn how to transfer data between RDBMS and Hadoop using Sqoop

  • Add a third-party connector into Sqoop

  • Export data from Hadoop and Hive to RDBMS

  • Describe third-party Sqoop connectors

In Detail

In today’s world, data is growing at a very fast rate, and people want to perform analytics by combining data from different sources (RDBMS, text files, and so on). Using Hadoop for analytics requires you to load data from RDBMS into Hadoop and perform analytics on that data, and then load the processed data back into RDBMS to generate business reports.
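
For instance, a basic round trip with Sqoop looks roughly like the following two commands; the connection string, credentials, table names, and HDFS directories here are illustrative placeholders, not examples taken from the book:

    # Import a table from an RDBMS (MySQL here) into HDFS
    sqoop import \
        --connect jdbc:mysql://dbserver/sales \
        --username dbuser --password dbpass \
        --table orders \
        --target-dir /user/hadoop/orders

    # After processing, export the results back to the RDBMS
    sqoop export \
        --connect jdbc:mysql://dbserver/sales \
        --username dbuser --password dbpass \
        --table order_reports \
        --export-dir /user/hadoop/order_reports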

Instant Apache Sqoop is a practical, hands-on guide that provides you with a number of clear, step-by-step exercises that will help you to take advantage of the real power of Apache Sqoop and give you a good grounding in the knowledge required to transfer data between RDBMS and the Hadoop ecosystem.

Instant Apache Sqoop looks at the import and export processes required in data transfer and discusses examples of each. It will also give you an overview of HBase and Hive table structures and show you how to populate HBase and Hive tables. The book will finish by taking you through a number of third-party Sqoop connectors.
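
As a rough sketch of what those imports look like, the following commands load an RDBMS table into Hive and into HBase respectively; the table name, HBase table, and column family are placeholders:

    # Import an RDBMS table into a Hive table
    sqoop import \
        --connect jdbc:mysql://dbserver/sales \
        --username dbuser --password dbpass \
        --table customers \
        --hive-import

    # Import the same table into an HBase table and column family
    sqoop import \
        --connect jdbc:mysql://dbserver/sales \
        --username dbuser --password dbpass \
        --table customers \
        --hbase-table customers \
        --column-family cf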

You will also learn about the various import and export arguments and how you can use them to move data between RDBMS and the Hadoop ecosystem, and the book explains the architecture behind the import and export processes.
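
As a taste of those arguments, the following import (again with placeholder values) restricts the columns and rows that are pulled and runs four map tasks in parallel:

    # Import selected columns and rows using 4 parallel mappers
    sqoop import \
        --connect jdbc:mysql://dbserver/sales \
        --username dbuser --password dbpass \
        --table orders \
        --columns "id,total,created_at" \
        --where "total > 1000" \
        --num-mappers 4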

If you want to move data between RDBMS and the Hadoop ecosystem, then this is the book for you: it covers everything that you need to know to transfer data between RDBMS and the Hadoop ecosystem, as well as how to add new connectors into Sqoop.

Table of Contents

  1. Instant Apache Sqoop
    1. Instant Apache Sqoop
    2. Credits
    3. About the Author
    4. About the Reviewer
    5. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    6. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    7. 1. Instant Apache Sqoop
      1. Working with the import process (Intermediate)
        1. Getting ready
        2. How to do it...
        3. How it works...
          1. Architecture of the import process
          2. Importing a single table
          3. Field and line terminator
          4. Supported output format
          5. Direct access mode
          6. Importing selected columns
          7. Importing selected rows
        4. There's more...
          1. Free form query imports
          2. Importing all tables
          3. Parallelism arguments
      2. Incremental import (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
      3. Populating the HBase table (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      4. Importing data into HBase (Intermediate)
        1. Getting ready
        2. How to do it...
        3. How it works...
          1. Importing a primary key table into HBase
          2. Importing a non-primary key table into HBase
      5. Populating the Hive table (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      6. Importing data into Hive (Simple)
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
          1. Importing a primary key table into Hive
          2. Importing a non-primary key table into Hive
      7. The exporting process (Intermediate)
        1. Getting ready
        2. How to do it...
        3. How it works...
          1. Exporting the selected HDFS directory
          2. Inputting the parsing arguments
          3. Exporting update commands
      8. Exporting data from Hive (Simple)
        1. How to do it...
        2. How it works...
      9. Using Sqoop connectors (Advanced)
        1. How to do it...
        2. How it works...
          1. Couchbase Sqoop connector
          2. The Netezza Sqoop connector
          3. Microsoft SQL Server connector
          4. Quest Data connector
          5. Teradata Sqoop connector