CHAPTER 1

image

The Problem with Data

The term “big data” refers to data sets so large and complex that traditional tools, like relational databases, are unable to process them in an acceptable time frame or within a reasonable cost range. Problems occur in sourcing, moving, searching, storing, and analyzing the data, but with the right tools these problems can be overcome, as you’ll see in the following chapters. A rich set of big data processing tools (provided by the Apache Software Foundation, Lucene, and third-party suppliers) is available to assist you in meeting all your big data needs.

In this chapter, I present the concept of big data and ...

Get Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.