Learning Path: Hadoop 2: Setting Up and Importing Data with Sqoop and Flume

Video Description

Get started with Hadoop 2 and importing data with Sqoop and Flume

In Detail

Apache Hadoop is an open source framework for distributed storage and processing of Big Data. Apache Sqoop and Apache Flume, two of the many widely used components of the Hadoop ecosystem, are used to import data into Hadoop from external sources.

We begin by setting up Hadoop: downloading, installing, and configuring it, then exploring Hue, a web interface for analyzing Hadoop data. We then turn to importing data into Hadoop, starting with how to import data manually. Next, we learn how to import databases using Apache Sqoop. Finally, we import real-time and streaming data into Hadoop using Apache Flume.

By the end of this Learning Path, you will have learned how to configure and set up the Hadoop framework from scratch and how to import data into the Hadoop distributed storage environment from various sources, so that you can get started with processing the data.

Prerequisites: Basic knowledge of the Elastic stack and Elasticsearch.

Resources: Code Downloads:

  • Hadoop 2: Setting Up and Importing Data with Sqoop and Flume