
Learning Hadoop 2 by Garry Turkington, Gabriele Modena


Extending Pig (UDFs)

Functions can appear in almost every Pig operator. UDFs differ from built-in functions in two main ways. First, they must be registered with the REGISTER keyword before Pig can use them. Second, they must be qualified when used (for example, by package or namespace) rather than called by bare name. Pig UDFs can currently be implemented in Java, Python, Ruby, JavaScript, and Groovy. Java has the most extensive support: Java functions can customize all parts of the process, including data load/store, transformation, and aggregation. Java functions are also more efficient, both because they are implemented in the same language as Pig and because additional interfaces are supported, such as the Algebraic and Accumulator interfaces. ...
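To make the two requirements concrete, here is a minimal sketch of a Python UDF, not an example taken from the book. The file name `udfs.py`, the namespace `myudfs`, the function `normalize`, and the relation and field names in the comments are all hypothetical. The sketch assumes Pig's Jython engine supplies the `outputSchema` decorator at registration time, and defines a stand-in so the file can also be imported outside Pig.

```python
# udfs.py (hypothetical file name)

# Assumption: when this script is registered with Pig's Jython engine,
# the outputSchema decorator is already defined. The fallback below is
# only a stand-in so the module can be imported and tested outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorator(func):
            # Record the schema string on the function for inspection.
            func.outputSchema = schema
            return func
        return decorator


@outputSchema("normalized:chararray")
def normalize(text):
    """Trim surrounding whitespace and lower-case a string; pass nulls through."""
    if text is None:
        return None
    return text.strip().lower()


# Illustrative Pig Latin usage (relation and field names are hypothetical):
#
#   REGISTER 'udfs.py' USING jython AS myudfs;   -- step 1: register the UDF
#   cleaned = FOREACH lines GENERATE myudfs.normalize(line);  -- step 2: qualify it
```

The `REGISTER` line makes the script available to Pig, and the `myudfs.` prefix is the qualification step: the function is called through the namespace given at registration rather than by its bare name.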
