Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Chapter 2. Storage

After the overview of Hadoop in the previous chapter, we now look at its component parts in more detail. This chapter starts at the conceptual bottom of the stack: the means and mechanisms for storing data within Hadoop. In particular, we will:

  • Describe the architecture of the Hadoop Distributed File System (HDFS)
  • Show what enhancements to HDFS have been made in Hadoop 2
  • Explore how to access HDFS using command-line tools and the Java API
  • Give a brief description of ZooKeeper—another (sort of) filesystem within Hadoop
  • Survey considerations for storing data in Hadoop and the available file formats

In Chapter 3, Processing – MapReduce and Beyond, we will describe how Hadoop ...
