© Deepak Vohra 2016

Deepak Vohra, Practical Hadoop Ecosystem, 10.1007/978-1-4842-2199-0_5

5. Apache Sqoop

Deepak Vohra

(1) Apt 105, White Rock, British Columbia, Canada

Apache Sqoop is a tool for transferring large quantities of data between a relational database, such as MySQL or Oracle Database, and the Hadoop ecosystem, which includes the Hadoop Distributed File System (HDFS), Apache Hive, and Apache HBase. Sqoop supports bidirectional transfer between a relational database and HDFS, but only one-way transfer from a relational database into Apache Hive and Apache HBase. Figure 5-1 illustrates the data transfer paths that Apache Sqoop supports.

Figure 5-1. Apache Sqoop data transfer paths
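
As a rough illustration of the transfer paths described above, the following Python sketch builds (but does not run) a Sqoop command line for each path. The JDBC URL, user name, table names, HDFS directories, and column family are hypothetical placeholders, and the helper `sqoop_command` is not part of Sqoop; only the `import` and `export` tools and the `--connect`, `--username`, `--table`, `--target-dir`, `--export-dir`, `--hive-import`, `--hbase-table`, and `--column-family` options come from the Sqoop command line. Authentication options are omitted for brevity.

```python
# Sketch only: builds Sqoop command lines without executing them.
# Every value below (URL, user, table, paths) is a hypothetical placeholder.

JDBC_URL = "jdbc:mysql://localhost/exampledb"
COMMON = [
    "--connect", JDBC_URL,
    "--username", "example_user",
    "--table", "example_table",
]


def sqoop_command(path):
    """Return the argument list for one of the supported transfer paths."""
    if path == "rdbms_to_hdfs":
        # Relational database -> HDFS
        return ["sqoop", "import", *COMMON, "--target-dir", "/example/data"]
    if path == "hdfs_to_rdbms":
        # HDFS -> relational database (the only reverse path)
        return ["sqoop", "export", *COMMON, "--export-dir", "/example/data"]
    if path == "rdbms_to_hive":
        # Relational database -> Apache Hive (one-way)
        return ["sqoop", "import", *COMMON, "--hive-import"]
    if path == "rdbms_to_hbase":
        # Relational database -> Apache HBase (one-way)
        return [
            "sqoop", "import", *COMMON,
            "--hbase-table", "example_hbase_table",
            "--column-family", "cf",
        ]
    raise ValueError(f"unsupported transfer path: {path}")


if __name__ == "__main__":
    for name in ("rdbms_to_hdfs", "hdfs_to_rdbms", "rdbms_to_hive", "rdbms_to_hbase"):
        print(" ".join(sqoop_command(name)))
```

The helper rejects any path name outside these four, mirroring the one-way restriction on Hive and HBase transfers stated above.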

The main commands supported ...
