Chapter 14

Integrating Hadoop with Relational Databases Using Sqoop

In This Chapter

Introducing Sqoop

Looking at the nuts and bolts of Sqoop

Importing data with Sqoop

Exporting data with Sqoop

Customizing your Sqoop input and output formats

Looking ahead to Sqoop 2.0

Performing analytics on large, diverse data sets is a natural fit for Apache Hadoop. The whole point of the Hadoop Distributed File System (HDFS) is that it provides a massively scalable store for diverse data. Combine that store with the many analytic tools available on the Hadoop platform (from MapReduce to Mahout and others), and you have a lean, mean analytics machine.

This rosy picture presents a slight problem, however: It turns out that most of the world’s structured data is already stored in relational database management systems (RDBMSs), and it’s common practice ...
