Call Data Record Analytics using Hive

Call Data Records (CDRs) are records that telecom operators keep for each call a subscriber makes, such as who called whom and when. We can use Hive to analyze these records, for example to identify customers who qualify for special offers.

Note

You can read more about CDR at https://en.wikipedia.org/wiki/Call_detail_record.

Getting ready

To perform this recipe, you should have a running Hadoop cluster with a recent version of Hive installed on it. Here, I am using Hive 1.2.1.

How to do it...

Suppose we have a dataset in the following pipe-delimited format. To analyze it, we first need to create a Hive table and load the data into it:

CALLER_PHONE_NO|RECEIVER_PHONE_NUMBER|START_TIME|END_TIME|CALL_TYPE ...
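
As a rough illustration, a table for this layout might be created and loaded along the following lines. This is only a sketch under several assumptions: the table name cdr_records and the input path /tmp/cdr_sample.txt are hypothetical, every column is typed as STRING because the source does not state the types, and only the five columns visible in the header above are declared (the header is truncated, so the real dataset may contain more).

```sql
-- Sketch only: table name, column types, and input path are assumptions.
-- Only the columns visible in the (truncated) header are declared here.
CREATE TABLE IF NOT EXISTS cdr_records (
  caller_phone_no       STRING,
  receiver_phone_number STRING,
  start_time            STRING,
  end_time              STRING,
  call_type             STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Hypothetical local path; replace it with the actual location of the CDR file.
LOAD DATA LOCAL INPATH '/tmp/cdr_sample.txt' INTO TABLE cdr_records;
```

Keeping the timestamps as STRING avoids guessing at their format; they can be cast or parsed at query time once the actual format is known.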
