HBase Administration Cookbook

Book Description

Master HBase configuration and administration for optimum database performance with this book and ebook.

  • Move large amounts of data into HBase and learn how to manage it efficiently

  • Set up HBase on the cloud, get it ready for production, and run it smoothly with high performance

  • Get the most out of HBase together with the Hadoop ecosystem, including HDFS, MapReduce, ZooKeeper, and Hive

In Detail

As an open source, distributed big data store, HBase scales to billions of rows with millions of columns and runs on clusters of commodity machines. If you are looking for a way to store and access huge amounts of data in real time, look no further than HBase.

HBase Administration Cookbook provides practical examples and simple step-by-step instructions for administering HBase with ease. The recipes cover a wide range of processes for managing a fully distributed, highly available HBase cluster on the cloud. Working with such a huge amount of data means that an organized and manageable process is key, and this book will help you achieve that.

The recipes in this practical cookbook start with setting up a fully distributed HBase cluster and moving data into it. You will learn how to use the tools for day-to-day administration tasks, as well as for efficiently managing and monitoring the cluster to achieve the best performance possible. Understanding the relationship between Hadoop and HBase allows you to get the best out of HBase, so the book also shows you how to set up Hadoop clusters, configure Hadoop to cooperate with HBase, and tune its performance.

Table of Contents

  1. HBase Administration Cookbook
    1. HBase Administration Cookbook
    2. Credits
    3. About the Author
    4. Acknowledgement
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Setting Up HBase Cluster
      1. Introduction
      2. Quick start
        1. Getting ready
        2. How to do it...
        3. How it works...
      3. Getting ready on Amazon EC2
        1. Getting ready
        2. How to do it...
        3. How it works...
      4. Setting up Hadoop
        1. Getting ready
        2. How to do it...
        3. How it works...
      5. Setting up ZooKeeper
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      6. Changing the kernel settings
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      7. Setting up HBase
        1. Getting ready
        2. How to do it...
        3. How it works...
      8. Basic Hadoop/ZooKeeper/HBase configurations
        1. How to do it...
        2. How it works...
        3. See also
      9. Setting up multiple High Availability (HA) masters
        1. Getting ready
        2. How to do it...
          1. Install and configure Heartbeat and Pacemaker
          2. Create and install a NameNode resource agent
          3. Configure highly available NameNode
          4. Start DataNode, HBase cluster, and backup HBase master
        3. How it works...
        4. There's more...
    9. 2. Data Migration
      1. Introduction
      2. Importing data from MySQL via single client
        1. Getting ready
        2. How to do it...
        3. How it works...
      3. Importing data from TSV files using the bulk load tool
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      4. Writing your own MapReduce job to import data
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
          1. Generating HFile files in MapReduce
          2. Important configurations affecting data migration
        5. See also
      5. Precreating regions before moving data into HBase
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
    10. 3. Using Administration Tools
      1. Introduction
      2. HBase Master web UI
        1. Getting ready
        2. How to do it...
        3. How it works...
      3. Using HBase Shell to manage tables
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      4. Using HBase Shell to access data in HBase
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      5. Using HBase Shell to manage the cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      6. Executing Java methods from HBase Shell
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      7. Row counter
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      8. WAL tool—manually splitting and dumping WALs
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      9. HFile tool—viewing textualized HFile content
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      10. HBase hbck—checking the consistency of an HBase cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      11. Hive on HBase—querying HBase using a SQL-like language
        1. Getting ready
        2. How to do it...
        3. How it works...
    11. 4. Backing Up and Restoring HBase Data
      1. Introduction
      2. Full shutdown backup using distcp
        1. Getting ready
        2. How to do it...
        3. How it works...
      3. Using CopyTable to copy data from one table to another
        1. Getting ready
        2. How to do it...
        3. How it works...
      4. Exporting an HBase table to dump files on HDFS
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      5. Restoring HBase data by importing dump files from HDFS
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      6. Backing up NameNode metadata
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      7. Backing up region starting keys
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      8. Cluster replication
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
    12. 5. Monitoring and Diagnosis
      1. Introduction
      2. Showing the disk utilization of HBase tables
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      3. Setting up Ganglia to monitor an HBase cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      4. OpenTSDB—using HBase to monitor an HBase cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      5. Setting up Nagios to monitor HBase processes
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      6. Using Nagios to check Hadoop/HBase logs
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      7. Simple scripts to report the status of the cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      8. Hot region—write diagnosis
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
    13. 6. Maintenance and Security
      1. Introduction
      2. Enabling HBase RPC DEBUG-level logging
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      3. Graceful node decommissioning
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      4. Adding nodes to the cluster
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      5. Rolling restart
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      6. Simple script for managing HBase processes
        1. Getting ready
        2. How to do it...
        3. How it works...
      7. Simple script for making deployment easier
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      8. Kerberos authentication for Hadoop and HBase
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      9. Configuring HDFS security with Kerberos
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      10. HBase security configuration
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
    14. 7. Troubleshooting
      1. Introduction
      2. Troubleshooting tools
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      3. Handling the XceiverCount error
        1. Getting ready
        2. How to do it...
        3. How it works...
      4. Handling the "too many open files" error
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      5. Handling the "unable to create new native thread" error
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      6. Handling the "HBase ignores HDFS client configuration" issue
        1. Getting ready
        2. How to do it...
        3. How it works...
      7. Handling the ZooKeeper client connection error
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      8. Handling the ZooKeeper session expired error
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      9. Handling the HBase startup error on EC2
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
    15. 8. Basic Performance Tuning
      1. Introduction
      2. Setting up Hadoop to spread disk I/O
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      3. Using network topology script to make Hadoop rack-aware
        1. Getting ready
        2. How to do it...
        3. How it works...
      4. Mounting disks with noatime and nodiratime
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      5. Setting vm.swappiness to 0 to avoid swap
        1. Getting ready
        2. How it works...
        3. See also
      6. Java GC and HBase heap settings
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      7. Using compression
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      8. Managing compactions
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      9. Managing a region split
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
    16. 9. Advanced Configurations and Tuning
      1. Introduction
      2. Benchmarking HBase cluster with YCSB
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      3. Increasing region server handler count
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      4. Precreating regions using your own algorithm
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      5. Avoiding update blocking on write-heavy clusters
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      6. Tuning memory size for MemStores
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      7. Client-side tuning for low latency systems
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
      8. Configuring block cache for column families
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      9. Increasing block cache size on read-heavy clusters
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. See also
      10. Client side scanner setting
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      11. Tuning block size to improve seek performance
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...
        5. See also
      12. Enabling Bloom Filter to improve the overall throughput
        1. Getting ready
        2. How to do it...
        3. How it works...
        4. There's more...