Summary

This chapter covered the advanced features of Pig and the optimizations it has to offer. The following are a few key takeaways:

  • As a rule, try to use Pig in as many situations as you can. Pig's abstractions, development aids, and flexibility can save you both time and money. Stretch Pig's capabilities before falling back to hand-written MapReduce jobs.
  • Pig's logical plan optimizations might change the order in which statements execute. Use EXPLAIN and ILLUSTRATE extensively to see how a script will actually run.
  • Help Pig to execute your script faster by following some of the guidelines mentioned in this chapter. Try to make your UDFs implement the Algebraic or Accumulator interface, ideally both.
  • Understand the data you are trying ...
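To make the advice about the Algebraic interface concrete, here is a minimal plain-Java sketch of the idea behind it: an aggregate that can be split into an initial, an intermediate, and a final stage can be partially computed before the reduce phase, so less data has to be shuffled. The sketch does not use the Pig API. The class name AlgebraicAverageSketch, the Partial type, and the method names are hypothetical and chosen only for illustration; a real UDF would implement Pig's Algebraic interface instead.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: plain Java, no Pig classes. All names here are hypothetical.
class AlgebraicAverageSketch {

    // A partial result that can be merged in any order: running sum and count.
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // "Initial" stage: summarizes one chunk of a group's values (map side).
    static Partial initial(List<Double> chunk) {
        double sum = 0.0;
        for (double v : chunk) {
            sum += v;
        }
        return new Partial(sum, chunk.size());
    }

    // "Intermediate" stage: merges partial results (combiner side).
    static Partial intermediate(List<Partial> partials) {
        double sum = 0.0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    // "Final" stage: turns the merged partial into the answer (reduce side).
    static double finish(Partial merged) {
        if (merged.count == 0) {
            throw new IllegalArgumentException("cannot average an empty group");
        }
        return merged.sum / merged.count;
    }

    public static void main(String[] args) {
        // Two chunks of the same group, as if they were read by different mappers.
        Partial a = initial(Arrays.asList(1.0, 2.0, 3.0));
        Partial b = initial(Arrays.asList(4.0, 5.0));
        System.out.println(finish(intermediate(Arrays.asList(a, b))));
    }
}
```

An average is a useful example because it is not directly mergeable (an average of averages is wrong in general), yet it becomes algebraic once each stage carries a sum and a count rather than the average itself.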
