Using Hive with Apache HBase
Hive is an ETL engine for HBase/Hadoop. It provides an SQL-like query language, popularly known as HiveQL, with SELECT for reads and INSERT for writes. Its main objective is ad hoc analysis of petabyte-scale data. Hive's HBase integration was originally introduced in HIVE-705.
Getting ready
- The HBase and Hadoop clusters should be up and running.
- Download Hive from https://archive.apache.org/dist/hive/hive-0.12.0/hive-0.12.0.tar.gz. If you are using the Linux command line, you can use this command instead:
wget https://archive.apache.org/dist/hive/hive-0.12.0/hive-0.12.0.tar.gz
- Untar it into a location of your choice, say /u/HbaseB.
- Hive uses an integration interface, HBaseStorageHandler, which enables Hive to talk to HBase (Hive projects need these ...
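The preparation steps above can be sketched as a small shell script. This is a minimal sketch rather than the book's own procedure: the function names, the `INSTALL_DIR` variable, and the `hive_hbase_demo` table with its `cf1:val` column mapping are hypothetical and exist only for illustration. The storage handler class name and the `hbase.columns.mapping` and `hbase.table.name` properties follow the Hive HBase integration documentation. Nothing is downloaded or executed until one of the functions is called.

```shell
#!/bin/sh
# Sketch of the "Getting ready" steps; sourcing this file only defines
# variables and functions, it does not download or run anything.
set -eu

HIVE_VERSION="0.12.0"
HIVE_TARBALL="hive-${HIVE_VERSION}.tar.gz"
HIVE_URL="https://archive.apache.org/dist/hive/hive-${HIVE_VERSION}/${HIVE_TARBALL}"
# Install location taken from the text; override by exporting INSTALL_DIR.
INSTALL_DIR="${INSTALL_DIR:-/u/HbaseB}"

# Download the release tarball into the current directory.
download_hive() {
    wget -O "${HIVE_TARBALL}" "${HIVE_URL}"
}

# Unpack the tarball under INSTALL_DIR, creating the directory if needed.
extract_hive() {
    mkdir -p "${INSTALL_DIR}"
    tar -xzf "${HIVE_TARBALL}" -C "${INSTALL_DIR}"
}

# Print a HiveQL statement that maps a Hive table onto an HBase table
# through HBaseStorageHandler. Table and column names are hypothetical.
hbase_table_ddl() {
    cat <<'EOF'
CREATE TABLE hive_hbase_demo (key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hive_hbase_demo");
EOF
}
```

A typical run would call `download_hive` and then `extract_hive`. Once Hive is configured and can reach the HBase cluster, the statement could be submitted with `hive -e "$(hbase_table_ddl)"`.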