Using Sqoop to move data from RDBMS to Data Lake

Sqoop transfers data between relational databases and Hadoop. You can import data into HDInsight from any relational database that has a JDBC driver, such as SQL Server, MySQL, Oracle, or Teradata.

Key benefits

The major benefits of using Sqoop to move data are as follows:

  • It leverages RDBMS metadata to determine the column data types
  • It is simple to script and uses SQL
  • It can handle change data capture by importing daily transactional data into HDInsight
  • It uses MapReduce for import and export, which enables parallel and efficient data movement
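
To make the scripting, change data capture, and parallelism points concrete, here is a minimal sketch in Python that assembles a `sqoop import` command line for an incremental load. The helper name `build_incremental_import`, the JDBC URL, the table, the target directory, and the check column are all hypothetical placeholders, not values from this book. Authentication options (such as `--username` and `--password-file`) are left out for brevity and would be needed against a real database.

```python
import shlex


def build_incremental_import(jdbc_url, table, target_dir,
                             check_column, last_value, num_mappers=4):
    """Return the argument list for an incremental-append `sqoop import`.

    Only rows whose check_column value is greater than last_value are
    imported, which is one simple way to pick up new daily transactions.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC connection string
        "--table", table,                   # source table in the RDBMS
        "--target-dir", target_dir,         # destination directory in Hadoop storage
        "--incremental", "append",          # import only newly added rows
        "--check-column", check_column,     # column used to detect new rows
        "--last-value", str(last_value),    # highest value already imported
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# Hypothetical example values, for illustration only.
cmd = build_incremental_import(
    jdbc_url="jdbc:mysql://dbhost.example.com/sales",
    table="orders",
    target_dir="/data/raw/orders",
    check_column="order_id",
    last_value=10000,
)

# Print a shell-safe command line that could be pasted into a script.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each option and its value together and avoids quoting mistakes if the list is later passed to a process launcher. The number of mappers is a tuning choice: more mappers increase parallelism but also open more concurrent connections to the source database.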

Two modes of using Sqoop

Sqoop can be used to get data into and out of Hadoop; it has two modes of operation:

  • Sqoop import: Data moves from ...
