Writing a user-defined function in Hive

The previous chapter covered how to write user-defined functions (UDFs) in Pig; this recipe does the same for Hive. Hive lets you add temporary functions, which can then be used in queries to process data. We will write a UDF in Java and create a function from it that can be used in data processing.

Getting ready

To perform this recipe, you need a running Hadoop cluster with the latest version of Hive installed on it. Here, I am using Hive 1.2.1. You will also need the Eclipse IDE for development.

How to do it

Hive supports various system functions, but sometimes you will need to do something that the system-provided functions cannot handle. ...
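As a rough illustration of the kind of logic such a function might carry, here is a minimal sketch in plain Java. The class name TrimLower and its behavior (trim and lowercase a string) are hypothetical and not taken from the recipe. To keep the sketch compilable without Hive on the classpath, the class does not extend anything; in a Hive 1.2.x project, a simple UDF would typically extend org.apache.hadoop.hive.ql.exec.UDF from the hive-exec JAR and expose its logic through a method named evaluate.

```java
import java.util.Locale;

// Hypothetical example: the body of a simple string-cleaning function.
// Assumption: in a real Hive project this class would be declared as
//   public class TrimLower extends org.apache.hadoop.hive.ql.exec.UDF
// which requires the hive-exec JAR on the build path.
class TrimLower {

    // Simple Hive UDFs expose their logic through a method named evaluate.
    // Returning null for null input mirrors how SQL NULLs usually propagate.
    public String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return input.trim().toLowerCase(Locale.ROOT);
    }

    // Small local check so the logic can be tried without a cluster.
    public static void main(String[] args) {
        TrimLower udf = new TrimLower();
        System.out.println(udf.evaluate("  Hello Hive  ")); // prints "hello hive"
    }
}
```

Once a class like this is packaged into a JAR, it would usually be made available in a Hive session with an ADD JAR statement followed by CREATE TEMPORARY FUNCTION ... AS '<fully qualified class name>'; the JAR path and function name depend on your own setup.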
