Development and debugging aids
There are three important commands that can help develop, debug, and optimize Pig scripts.
The DESCRIBE command
The DESCRIBE command prints the schema of a relation. It is useful when you are new to Pig Latin and want to see how each operator transforms the data. For the groupByCountry relation in the previous script, which finds the population of a country, DESCRIBE gives the following output:
groupByCountry: {group: chararray,generateRecords: {(cc::cname: chararray,ccity::cityName: chararray,ccity::population: long)}}
The DESCRIBE output uses Pig's own schema syntax, in which curly braces denote a bag and parentheses denote a tuple. In the preceding example, groupByCountry is a bag whose tuples contain a group element and another bag, generateRecords.
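As an illustration, a script along the following lines would be expected to produce a relation with the schema shown above. This is only a sketch: the input file names, the delimiter, the ccode join key, and the intermediate alias joined are assumptions made for the example and are not taken from the original script. Only the aliases cc, ccity, generateRecords, and groupByCountry, and the fields cname, cityName, and population, appear in the DESCRIBE output.

```pig
-- Hypothetical inputs: file names, delimiter, and the ccode column are assumed.
cc = LOAD 'countrycodes.txt' USING PigStorage(',')
     AS (ccode:chararray, cname:chararray);
ccity = LOAD 'worldcitiespop.txt' USING PigStorage(',')
        AS (ccode:chararray, cityName:chararray, population:long);

-- After a JOIN, Pig prefixes each field with its source alias (cc::, ccity::).
joined = JOIN cc BY ccode, ccity BY ccode;

-- Keep only the three fields that appear in the schema.
generateRecords = FOREACH joined
                  GENERATE cc::cname, ccity::cityName, ccity::population;

-- GROUP yields one tuple per country: the key (group) plus a bag of records.
groupByCountry = GROUP generateRecords BY cname;

-- Print the schema of each stage to follow how the data is reshaped.
DESCRIBE generateRecords;
DESCRIBE groupByCountry;
```

Running DESCRIBE on each intermediate relation, rather than only on the final one, makes it easier to see which operator introduced a given field or nesting level.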
The EXPLAIN ...