Summary

In this chapter, we gave a high-level definition of Hadoop and covered some fun Hadoop FAQs. We noted that reaching the limits of MS Excel does not by itself mean that you are dealing with big data. To prove the point, we used simple R scripts to manipulate and visualize the same data that would not load in Excel.

We then introduced the Amazon AWS environment as a simple, affordable, yet robust way to leverage the power of Hadoop. We stepped through the process of configuring that environment for our use and uploading our multiple web log files to it, and then used Hive and its query language (HiveQL) to access and manipulate that data to accomplish the same objectives ...
