Storing and processing Hive data in a sequential file format

Most of the time, you have probably created Hive tables and stored their data in a text format; in this recipe, we are going to store data in sequential files (Hadoop's SequenceFile format) instead.

Getting ready

To perform this recipe, you need a running Hadoop cluster with a recent version of Hive installed on it. Here, I am going to use Hive 1.2.1.

How to do it...

Hive 1.2.1 supports several file formats, some of which can make data processing faster. In this recipe, we are going to use sequential files to store data in Hive. To do so, we first need to create a Hive table that stores the data in a textual format:

create table employee( id int, name string) row format delimited ...
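The statement above is truncated in this excerpt, so the following is only a minimal sketch of how the overall flow might look, not the recipe's exact code. The reason for starting with a text table is that LOAD DATA simply moves files into the table's directory without converting them, so raw text is usually loaded into a text table first and then copied into a sequence-file table with a query. The comma delimiter, the local file path /tmp/employee.csv, and the table name employee_seq are assumptions made for illustration.

```sql
-- Text-format staging table (the comma delimiter is an assumption).
CREATE TABLE employee (id INT, name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load a delimited text file (hypothetical path) into the staging table.
LOAD DATA LOCAL INPATH '/tmp/employee.csv' INTO TABLE employee;

-- Table with the same columns, stored as a SequenceFile
-- (employee_seq is a hypothetical name).
CREATE TABLE employee_seq (id INT, name STRING)
STORED AS SEQUENCEFILE;

-- Copy the rows; Hive rewrites them in the SequenceFile format.
INSERT OVERWRITE TABLE employee_seq
SELECT id, name FROM employee;

-- Quick check that the data is readable from the new table.
SELECT * FROM employee_seq LIMIT 10;
```

Queries against employee_seq are written exactly like queries against the text table; only the on-disk storage format differs.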
