To implement UDTF, there is only one method extending from org.apache.hadoop.hive.ql.exec.GenericUDTF. There is no plain UDTF class. We need to implement three methods: initialize(), process(), and close(). The UDTF will call the initialize() method, which returns the information of the function output, such as data type and number of output. Then, the process() method is called to perform core function logic with arguments and forward the result. Finally, the close() method will do a proper cleanup if needed. The code template for UDTF is as follows:
package com.packtpub.hive.essentials.hiveudtf; import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF; import org.apache.hadoop.hive.ql.exec.Description; import org.apache.hadoop.hive.ql.exec.UDFArgumentException; ...