O'Reilly Live Online Training

Mastering the Cloudera Manager for your Hadoop Cluster

A hands-on course to configure, manage, and optimize your CDH cluster

Amit Rustagi

Many companies today are seeking ways to improve operational efficiency by leveraging the power of Hadoop. One challenge of working with Hadoop is that performance tuning is a largely manual process that requires thorough knowledge of many configuration settings, and the time needed to manage a cluster increases as it scales. The Cloudera Distribution of Hadoop (CDH) provides a platform for data crunching and advanced machine learning that helps Hadoop architects and administrators improve overall efficiency.

This hands-on course reviews the capabilities of Cloudera Manager in depth to help you configure, manage, and optimize your CDH cluster.

What you'll learn and how you can apply it

By the end of this online course, you'll understand:

  • How to monitor CDH services and jobs using Cloudera Manager
  • How to configure, manage, optimize, and administer a CDH cluster using Cloudera Manager
  • How to install third-party packages that are supported on a CDH cluster
  • How to configure Hadoop services and define advanced configurations for each service

And you'll be able to:

  • Manage configuration changes on a CDH cluster with Cloudera Manager as the cluster grows
  • Debug issues on a CDH cluster more efficiently using Cloudera Manager

This training course is for you because...

  • You are a Hadoop architect looking for deeper knowledge of Cloudera Manager.
  • You are a Hadoop administrator looking to master the capabilities of Cloudera Manager.

Prerequisites

  • You'll need the Cloudera VM installed on your machine before the class.
  • You can find information about downloading and installing the Cloudera VM here. Please ensure you download the Enterprise version.

About your instructor

  • Amit Rustagi is a Big Data expert with many years of industry experience using Hadoop for machine learning and for processing very large datasets in semiconductor manufacturing, ecommerce, and web analytics. Previously, Amit was a Big Data Technologist at Yahoo! and eBay, working on building large-scale distributed processing systems.

Schedule

The timeframes are only estimates and may vary depending on how the class progresses.

Day 1:

  • Cloudera Manager Overview (5 Min)
  • Cloudera Manager Features (15 Min)
  • Cloudera Manager Architecture (10 Min)
  • Understanding the configuration of a service (10 Min)
  • Dynamic and Static allocation of resources on a CDH cluster (10 Min)
  • Break (10 Min)
  • Managing Users and Roles (10 Min)
  • Configuring HDFS (20 Min)
  • Configuring YARN (20 Min)
  • Configuring Spark and HBase (40 Min)
  • Setting up Flume configuration (30 Min)
  • Q&A (10 Min)

Day 2:

  • Setting up Hive and Impala configuration (30 Min)
  • Setting up Kerberos security (30 Min)
  • Setting up replication and snapshots using Cloudera Manager (20 Min)
  • Cluster Management using Cloudera Manager (10 Min)
  • Break (10 Min)
  • Monitoring applications, jobs, and tasks (10 Min)
  • Setting up alerts in Cloudera Manager (20 Min)
  • Using Logs and Usage for monitoring (20 Min)
  • Using Chart Builder for monitoring (20 Min)
  • Q&A (10 Min)