Hadoop: The Definitive Guide by Tom White

Appendix B. Cloudera’s Distribution for Hadoop

by Matt Massie and Todd Lipcon, Cloudera

Cloudera’s Distribution for Hadoop is based on the most recent stable version of Apache Hadoop, with numerous patches, backports, and updates. Cloudera provides the distribution in several formats: compressed tar files, RPMs, Debian packages, and Amazon EC2 AMIs. It is free, released under the Apache 2.0 license, and available at http://www.cloudera.com/hadoop/.

Cloudera has an online configurator at http://www.cloudera.com/configurator to make setting up a Hadoop cluster easy (Figure B-1). The configurator has a simple wizard-like interface that asks targeted questions about your cluster. When you’ve finished, it generates customized Hadoop packages and places them in a package repository for you. You can manage any number of clusters and return later to update your active configurations.

Figure B-1. Cloudera’s online configurator makes it easy to set up a Hadoop cluster

To simplify package management, Cloudera serves RPMs from a yum repository and Debian packages from an apt repository. With Cloudera’s Distribution for Hadoop, you can install and configure Hadoop on each machine in your cluster by running a single command. Kickstart users benefit even more, because they can commission entire Hadoop clusters automatically ...
