Learning Apache Hadoop

Video Description

In this Introduction to Hadoop training course, expert author Rich Morrow will teach you the tools and functions needed to work within this open-source software framework. This course is designed for the absolute beginner, meaning no prior experience with Hadoop is required.
You will start out by learning the basics of Hadoop, including the Hadoop run modes and job types and Hadoop in the cloud. You will then learn about the Hadoop Distributed File System (HDFS), including the HDFS architecture, the secondary name node, and access controls. This video tutorial also covers MapReduce, debugging basics, Hive and Pig basics, and Impala fundamentals. Finally, Rich will teach you how to import and export data.
Once you have completed this computer-based training video, you will be able to use the tools and functions you’ve learned to work in Hadoop. Working files are included, allowing you to follow along with the author throughout the lessons.

Table of Contents

  1. Introduction
    1. What Is Big Data? 00:06:25
    2. About The Author 00:02:50
    3. Historical Approaches 00:07:50
    4. Big Data In The Modern World 00:10:20
    5. The Hadoop Approach 00:09:56
    6. Hadoop Hardware Requirements 00:11:26
    7. Hadoop Core Vs. Ecosystem 00:06:52
    8. Hadoopable Problems 00:06:43
    9. Hadoop Support Companies 00:05:40
    10. How To Access Your Working Files 00:01:15
  2. Hadoop Basics
    1. HDFS And MapReduce 00:07:56
    2. Hadoop Run Modes And Job Types 00:05:48
    3. Hadoop Software Requirements And Recommendations 00:03:16
    4. Hadoop in the Cloud - Amazon Web Services 00:04:51
    5. Lab - Installing Hadoop From CDH With Cloudera Manager - Part 1 00:10:32
    6. Lab - Installing Hadoop From CDH With Cloudera Manager - Part 2 00:05:47
    7. Lab - Installing Hadoop From CDH With Cloudera Manager - Part 3 00:10:14
    8. Lab - Installing Hadoop From CDH With Cloudera Manager - Part 4 00:08:46
    9. Introduction To Hive And Pig Interface 00:05:48
    10. Installing Cloudera Quickstart VM 00:10:59
  3. Hadoop Distributed File System (HDFS)
    1. HDFS Architecture 00:08:11
    2. HDFS File Write Walkthrough 00:12:19
    3. Secondary Name Node 00:03:33
    4. Lab - Using HDFS - Part 1 00:07:31
    5. Lab - Using HDFS - Part 2 00:09:13
    6. HA And Federation Basics 00:07:01
    7. HDFS Access Controls 00:07:23
  4. MapReduce
    1. MapReduce Explained 00:10:44
    2. MapReduce Architecture 00:05:26
    3. MapReduce Code Walkthrough - Part 1 00:11:50
    4. MapReduce Code Walkthrough - Part 2 00:15:48
    5. MapReduce Job Walkthrough 00:09:49
    6. Rack Awareness 00:03:47
    7. Advanced MapReduce - Partitioners, Combiners, Comparators And More 00:11:54
    8. Partitioner Code Walkthrough 00:08:02
    9. Java Concerns 00:10:10
  5. Logging And Debugging
    1. Debugging Basics 00:15:04
    2. Benchmarking With Teragen And Terasort 00:10:15
  6. Hive, Pig, And Impala
    1. Comparing Hive, Pig And Impala 00:05:30
    2. Hive Basics 00:08:36
    3. Hive Patterns And Anti-Patterns 00:01:54
    4. Lab - Hive Basic Usage 00:19:26
    5. Pig Basics 00:08:37
    6. Pig Patterns And Anti-Patterns 00:02:13
    7. Lab - Pig Basic Usage 00:22:50
    8. Impala Fundamentals 00:05:36
  7. Data Import And Export
    1. Import And Export Options 00:08:13
    2. Flume Introduction 00:08:11
    3. Lab - Using Flume 00:13:48
    4. HDFS Interaction Tools 00:06:43
    5. Sqoop Introduction 00:06:46
    6. Lab - Using Sqoop 00:18:32
    7. Oozie Introduction 00:07:01
  8. Conclusion
    1. Wrap-Up 00:02:28