Chapter 9. Providing a Big Data Integration Layer with Apache Hadoop

In this chapter, we will cover the following topics:

  • Installing Hadoop client bundles in Apache Karaf
  • Accessing Apache Hadoop from Karaf
  • Adding commands that talk to HDFS for deployment in Karaf

Introduction

To continue building on data storage models that aren't as inflexible as traditional RDBMS structures, we will look at Apache Hadoop. Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at the time, named it after his son's toy elephant. It was originally developed to support distribution for the Nutch search engine project.

Hadoop followed the ideas published by Google in the papers pertaining to Google File System and Google MapReduce. ...

Get Apache Karaf Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.