HBase Essentials

Book Description

A practical guide to realizing the seamless potential of storing and managing high-volume, high-velocity data quickly and painlessly with HBase

In Detail

With an example-oriented approach, this book begins with a step-by-step process for setting up HBase clusters and designing schemas. It then takes you through advanced data modeling concepts and the intricacies of the HBase architecture, and introduces the HBase client API and the HBase shell. The book aims to give you a solid grounding in the NoSQL columnar database space and to help you take advantage of the real power of HBase using data scans, filters, and the MapReduce framework. It also provides practical use cases covering various HBase clients, HBase cluster administration, and performance tuning.

What You Will Learn

  • Realize the need for HBase
  • Download and set up an HBase cluster
  • Grasp data modeling concepts in HBase and how to perform CRUD operations on data
  • Perform effective data scanning and data filtration in HBase
  • Understand data storage and replication in HBase
  • Explore HBase counters, coprocessors, and MapReduce integration
  • Get acquainted with different clients of HBase such as REST and Kundera ORM
  • Learn about cluster management and performance tuning in HBase

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

    Table of Contents

    1. HBase Essentials
      1. Table of Contents
      2. HBase Essentials
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Introducing HBase
        1. The world of Big Data
        2. The origin of HBase
        3. Use cases of HBase
        4. Installing HBase
          1. Installing Java 1.7
          2. The local mode
          3. The pseudo-distributed mode
          4. The fully distributed mode
        5. Understanding HBase cluster components
          1. Start playing
        6. Summary
      9. 2. Defining the Schema
        1. Data modeling in HBase
        2. Designing tables
        3. Accessing HBase
          1. Establishing a connection
          2. CRUD operations
            1. Writing data
            2. Reading data
            3. Updating data
            4. Deleting data
        4. Summary
      10. 3. Advanced Data Modeling
        1. Understanding keys
        2. HBase table scans
        3. Implementing filters
          1. Utility filters
          2. Comparison filters
          3. Custom filters
        4. Summary
      11. 4. The HBase Architecture
        1. Data storage
          1. HLog (the write-ahead log – WAL)
          2. HFile (the real data storage file)
        2. Data replication
        3. Securing HBase
          1. Enabling authentication
          2. Enabling authorization
          3. Configuring REST clients
        4. HBase and MapReduce
          1. Hadoop MapReduce
          2. Running MapReduce over HBase
          3. HBase as a data source
          4. HBase as a data sink
          5. HBase as a data source and sink
        5. Summary
      12. 5. The HBase Advanced API
        1. Counters
          1. Single counters
          2. Multiple counters
        2. Coprocessors
          1. The observer coprocessor
          2. The endpoint coprocessor
        3. The administrative API
          1. The data definition API
            1. Table name methods
            2. Column family methods
            3. Other methods
          2. The HBaseAdmin API
        4. Summary
      13. 6. HBase Clients
        1. The HBase shell
          1. Data definition commands
          2. Data manipulation commands
          3. Data-handling tools
        2. Kundera – object mapper
          1. CRUD using Kundera
          2. Query HBase using Kundera
          3. Using filters within query
        3. REST clients
          1. Getting started
            1. The plain format
            2. The XML format
            3. The JSON format (defined as a key-value pair)
            4. The REST Java client
        4. The Thrift client
          1. Getting started
        5. The Hadoop ecosystem client
          1. Hive
        6. Summary
      14. 7. HBase Administration
        1. Cluster management
          1. The Start/stop HBase cluster
          2. Adding nodes
          3. Decommissioning a node
          4. Upgrading a cluster
          5. HBase cluster consistency
          6. HBase data import/export tools
          7. Copy table
        2. Cluster monitoring
          1. The HBase metrics framework
            1. Master server metrics
            2. Region server metrics
            3. JVM metrics
            4. Info metrics
            5. Ganglia
            6. Nagios
            7. JMX
            8. File-based monitoring
        3. Performance tuning
          1. Compression
            1. Available codecs
          2. Load balancing
          3. Splitting regions
          4. Merging regions
          5. MemStore-local allocation buffers
          6. JVM tuning
          7. Other recommendations
        4. Troubleshooting
        5. Summary
      15. Index