
Programming Pig, 2nd Edition by Daniel Dai, Alan Gates


Chapter 13. Use Cases and Programming Examples

In this chapter we will take a look at several comprehensive Pig examples and real-world Pig use cases.

Sparse Tuples

In “Schema Tuple Optimization” we introduced a more compact tuple implementation called the schema tuple. However, if your input data is sparse, a schema tuple is not the most efficient way to represent it. You only need to store the position and value of the nonempty fields of the tuple, which you can do with a sparse tuple. Since the vast majority of fields in such a tuple are empty, this data structure can save a lot of space. Sparse tuples are not natively supported by Pig. However, Pig allows users to define custom tuple implementations, so you can implement them yourself. In this section, we will show you how to implement the sparse tuple and use it in Pig.

First, we will need to write a SparseTuple class that implements the Tuple interface. However, implementing all methods of the Tuple interface is tedious. To make it easier we derive SparseTuple from AbstractTuple, which already implements most common methods. Inside SparseTuple, we create a TreeMap that stores the index and value of each nonempty field. We also keep track of the size of the tuple. With both fields, we have the complete state of the sparse tuple. Here is the data structure along with the getter and setter methods of SparseTuple:

public class SparseTuple extends AbstractTuple {

    Map<Integer, Object> matrix = new TreeMap<Integer, Object ...
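Because the listing above is cut off, the following is a minimal standalone sketch of the same idea rather than the book's SparseTuple. It assumes nothing from Pig: the class name SparseFieldsSketch and its methods (storedCount, set, get, append) are hypothetical and chosen only for illustration. It shows how a TreeMap keyed by field position, together with a separately tracked size, can capture the full state of a sparse record.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: this class does not extend Pig's
// AbstractTuple and is not the SparseTuple from the text.
class SparseFieldsSketch {
    // Field position -> value, for nonempty fields only (kept sorted by position).
    private final Map<Integer, Object> fields = new TreeMap<>();
    // Logical number of fields, counting the empty ones too.
    private int size;

    SparseFieldsSketch(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        this.size = size;
    }

    // Logical width of the record.
    int size() {
        return size;
    }

    // Number of values actually held in memory.
    int storedCount() {
        return fields.size();
    }

    // Returns the value at fieldNum, or null if that field is empty.
    Object get(int fieldNum) {
        checkIndex(fieldNum);
        return fields.get(fieldNum);
    }

    // Stores a value; setting null clears the entry so empty fields cost nothing.
    void set(int fieldNum, Object val) {
        checkIndex(fieldNum);
        if (val == null) {
            fields.remove(fieldNum);
        } else {
            fields.put(fieldNum, val);
        }
    }

    // Adds one field at the end; a null value only grows the logical size.
    void append(Object val) {
        if (val != null) {
            fields.put(size, val);
        }
        size++;
    }

    private void checkIndex(int fieldNum) {
        if (fieldNum < 0 || fieldNum >= size) {
            throw new IndexOutOfBoundsException("field " + fieldNum + " out of range 0.." + (size - 1));
        }
    }
}
```

In this sketch, a record created with a size of 1000 and two calls to set holds only two map entries, while size() still reports 1000 and get returns null for every untouched position. Using a TreeMap keeps the nonempty positions in ascending order, which is convenient when fields have to be written out or compared in sequence.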
