Cassandra High Performance Cookbook by Edward Capriolo

A pseudo-distributed Hadoop setup

A production Hadoop cluster can range from a single node to thousands of machines. Each cluster has one of each of these components:

  • NameNode: Component that stores the file system metadata
  • Secondary NameNode: Checkpoints the NameNode
  • JobTracker: Component in charge of job scheduling

These components are installed on multiple machines:

  • TaskTracker: Component that runs individual tasks of a job
  • DataNode: Component that stores data to disk
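
The split between the two groups of components can be sketched in a few lines of Python. This is only an illustration of the roles listed above, not Hadoop code: the host names and the plan_cluster helper are hypothetical, and placing all three single-instance components on one master host is a simplifying assumption.

```python
# Daemon roles as listed above; host names used below are hypothetical.
MASTER_DAEMONS = ["NameNode", "SecondaryNameNode", "JobTracker"]  # one of each per cluster
WORKER_DAEMONS = ["TaskTracker", "DataNode"]                      # installed on multiple machines


def plan_cluster(master_host, worker_hosts):
    """Return a mapping of host name -> list of daemons to run on that host."""
    layout = {}
    # Assumption: all single-instance daemons share one master host.
    layout.setdefault(master_host, []).extend(MASTER_DAEMONS)
    for host in worker_hosts:
        layout.setdefault(host, []).extend(WORKER_DAEMONS)
    return layout


# Single-machine layout: every daemon lands on the same host.
single = plan_cluster("localhost", ["localhost"])

# Small multi-machine layout: masters on one host, workers on the others.
multi = plan_cluster("master01", ["worker01", "worker02"])

for host, daemons in multi.items():
    print(host, daemons)
```

Passing the same host as both master and worker yields a layout in which one machine runs all five components.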

The communication between the components is depicted in the following image:

[Figure: A pseudo-distributed Hadoop setup]

For Hadoop to be effective at grid computing, it needs to be installed on multiple machines, but the stack can ...