Getting data into Hadoop

Now that we have put in all that up-front effort, let us look at ways of bringing the data out of MySQL and into Hadoop.

Using MySQL tools and manual import

The simplest way to export data into Hadoop is to use existing command-line tools and statements. To export an entire table (or indeed an entire database), MySQL offers the mysqldump utility. To do a more precise export, we can use a SELECT ... INTO OUTFILE statement of the following form, which writes the result set to a file on the MySQL server host:

SELECT col1, col2 FROM table_name
INTO OUTFILE '/tmp/out.csv'
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';

Once we have an export file, we can move it into HDFS using hadoop fs -put or into Hive through the methods discussed in the previous chapter.
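
The manual route described above can be strung together in a small shell script. The following is only a sketch under stated assumptions: the database name, table name, export directory, and HDFS directory are hypothetical placeholders, MySQL credentials are assumed to come from an option file such as ~/.my.cnf, and the script is assumed to run on the MySQL server host, because mysqldump --tab has the server write the data file. It uses mysqldump rather than the SELECT statement, and by default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the manual path: MySQL table -> local delimited file -> HDFS.
# Every name below is a hypothetical placeholder; override via the environment.
DB_NAME="${DB_NAME:-mydb}"
TABLE_NAME="${TABLE_NAME:-mytable}"
EXPORT_DIR="${EXPORT_DIR:-/tmp/export}"
HDFS_DIR="${HDFS_DIR:-/user/hadoop/import}"

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

export_and_load() {
    # --tab makes the server write <table>.txt (data) into EXPORT_DIR,
    # alongside <table>.sql (the CREATE TABLE statement).
    run mysqldump --tab="$EXPORT_DIR" --fields-terminated-by=, \
        "$DB_NAME" "$TABLE_NAME" || return 1

    # Create the target directory in HDFS, then copy the data file into it.
    run hadoop fs -mkdir -p "$HDFS_DIR" || return 1
    run hadoop fs -put "$EXPORT_DIR/$TABLE_NAME.txt" "$HDFS_DIR/" || return 1
}

export_and_load
```

Running the script as is lists the three commands. Setting DRY_RUN=0 executes them, which requires working mysqldump and hadoop clients and a MySQL account that is allowed to write files on the server.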

Have a go hero – exporting the employee table into ...
