Hadoop in Practice by Alex Holmes

Chapter 10. Hacking with Hive

 

This chapter covers
  • Learning how serialization and deserialization work in Hive
  • Writing a UDF to use the distributed cache
  • Optimizing your joins for faster query execution times
  • Using the EXPLAIN command to understand how Hive is planning your work

 

Working with MapReduce is nontrivial and has a steep learning curve, even for Java programmers. Over the course of the next three chapters, we’ll look at technologies that lower the barrier to entry for MapReduce.

Let’s say that it’s nine o’clock in the morning and you’ve been asked to generate a report on the top ten countries that generated visitor traffic over the last month. And it needs to be done by noon. Your log data is sitting in HDFS ready to be used. ...
