Live online training

Write Your First Hadoop MapReduce Program

Diving Head First into Hadoop

Jesse Anderson

Join expert Jesse Anderson to learn how to create powerful programs using Hadoop MapReduce. In just three hours, you’ll dive into what developers and technical teams need to know to succeed with big data, along with the concepts behind Apache Hadoop, one of the most widely used open source big data frameworks. You’ll explore Hadoop’s two main components, HDFS and MapReduce, and work through hands-on exercises that demonstrate how to create a job with the MapReduce API before writing a MapReduce job of your own.

What you'll learn and how you can apply it

By the end of this live, online course, you’ll understand:

  • What HDFS is and what its limitations are
  • What a shuffle sort is and how it works
  • How to create a MapReduce program

And you’ll be able to:

  • Write a MapReduce program
  • Use an IDE to debug your MapReduce program
  • Access data stored in HDFS

This training course is for you because...

  • You’re a software engineer who wants to write Hadoop MapReduce code.
  • You’re a software architect who needs to understand how data flows through Hadoop MapReduce.
  • You’re a business analyst who needs to write a more complex analysis with Hadoop MapReduce.
  • You’re a business intelligence analyst who wants to learn how to run complex analytics at scale.
  • You’re a quality assurance engineer who needs to test Hadoop MapReduce code.

Prerequisites

  • All attendees need a technical background. To complete the programming exercises, you need to be able to program in one of the following languages: Java, Ruby, Python, or Perl.
  • If you are taking this class at your place of work, verify with your network administrator that you can access ports 4822 and 8080. If those ports aren't open, please ask your network administrator to open them.

Virtual machine setup instructions needed prior to class

https://www.dropbox.com/s/6bl5yag85ur2bt0/Write_Your_First_Hadoop_MapReduce_Program_VM_Instructions.pdf?dl=0

About your instructor

  • Jesse Anderson is the Managing Director at Big Data Institute. He trains companies ranging from startups to the Fortune 100 on big data, including cutting-edge technologies such as Apache Kafka, Apache Hadoop, and Apache Spark. He has taught thousands of students the skills to become data engineers.

    He is widely regarded as an expert in the field and is known for his novel teaching practices. Jesse has been published by O’Reilly and Pragmatic Programmers. He has been covered by prestigious media outlets such as The Wall Street Journal, CNN, BBC, NPR, Engadget, and Wired.

Schedule

The timeframes are only estimates and may vary depending on how the class is progressing.

  • Introduction (10 minutes)
  • Lecture: MapReduce overview—the concepts behind MapReduce; how MapReduce works (30 minutes)
  • Lecture and demonstration: Coding with MapReduce—how to create a MapReduce job with the MapReduce API (80 minutes)
  • Break (15 minutes)
  • Hands-on exercise: Writing your first MapReduce job (60 minutes)