Chapter 2. Loading Data from Various DBs

In this chapter, we will cover the following:

  • Extracting data from Oracle
  • Loading data using Oracle Big Data Connector
  • Bulk utilities
  • Using Hive and Apache Flume Streaming Data in Apache HBase
  • Using Sqoop

These recipes will let you import data from different RDBMSs and flat files.

Introduction

HBase is very effective at giving real-time platforms random read/write access to data on disk using commodity hardware. There are several ways to load data into it, such as the following:

  • Put APIs
  • BulkLoad Tool
  • MapReduce jobs

The Put API is the most straightforward way to write data into HBase, but it is only suitable for small sets of data; it can be used for site-facing applications or for more real-time ...
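
To make the idea of a single-row put concrete, here is a minimal sketch. It does not use the Java client's Put class that the text refers to; instead, as a swapped-in illustration that runs without any HBase libraries, it builds the JSON body that HBase's REST gateway accepts for a row write, where row keys, column names, and values are base64-encoded. The row key, the `info` column family, and the helper names `b64` and `build_put_payload` are hypothetical and chosen only for this example.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string, as the REST gateway's JSON format expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_put_payload(row_key: str, cells: dict) -> str:
    """Build the JSON body for writing one row.

    `cells` maps "family:qualifier" column names to string values.
    """
    row = {
        "key": b64(row_key),
        "Cell": [
            {"column": b64(column), "$": b64(value)}
            for column, value in cells.items()
        ],
    }
    return json.dumps({"Row": [row]})


# Hypothetical row and columns, for illustration only.
payload = build_put_payload("user-001", {"info:name": "Ada", "info:city": "London"})
print(payload)
```

A body like this would typically be sent with an HTTP PUT to the gateway at a path of the form `/<table>/<row key>` with the `Content-Type: application/json` header. Each call writes one row over one round trip, which suggests why put-style writes suit small data sets rather than bulk loads.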
