Writing a user-defined function in Pig

In this recipe, we will learn how to write user-defined functions (UDFs) so that we can apply our own custom logic, such as filters and transformations, to data in Pig.

Getting ready

To follow this recipe, you need a running Hadoop cluster with the latest version of Pig installed on it. You will also need an IDE, such as Eclipse, to write the Java class.

How to do it...

In this recipe, we are going to write a user-defined function for the employee dataset we have been using throughout this chapter. Let's assume that we want to convert all the names in the dataset to uppercase. To do this, we will write a UDF that converts lowercase letters to uppercase letters.
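As a rough illustration of the conversion logic such a UDF needs, here is a minimal sketch in plain Java. It is not the book's code: the class name `UpperCaseSketch` and the method `toUpper` are hypothetical, and the Pig wiring is left out so that the example runs without the Pig JAR. In a real Pig UDF, this logic would typically sit inside the `exec(Tuple input)` method of a class that extends `org.apache.pig.EvalFunc<String>`.

```java
import java.util.Locale;

// Hypothetical helper; the name UpperCaseSketch is illustrative only.
// Assumption: in an actual Pig UDF, this logic would live in exec(Tuple input)
// of a class extending org.apache.pig.EvalFunc<String>.
final class UpperCaseSketch {
    private UpperCaseSketch() {}

    /**
     * Converts a single field value to uppercase.
     * Returns null for a null field, so missing names stay null
     * instead of causing a NullPointerException.
     */
    static String toUpper(Object field) {
        if (field == null) {
            return null;
        }
        // Locale.ROOT keeps the result independent of the machine's default locale.
        return field.toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("john smith")); // prints JOHN SMITH
        System.out.println(toUpper(null));         // prints null
    }
}
```

Once wrapped in an `EvalFunc` subclass and packaged as a JAR, a function like this would usually be made available to a Pig script with `REGISTER` and then called inside a `FOREACH ... GENERATE` statement on the name field.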

Writing a UDF is very simple: we ...
