CHAPTER 4

The Evolution of Analytic Scalability

It goes without saying that the world of big data requires new levels of scalability. As the amount of data organizations process continues to increase, the same old methods for handling data just won’t work anymore. Organizations that don’t update their technologies to provide a higher level of scalability will quite simply choke on big data. Luckily, there are multiple technologies available that address different aspects of the process of taming big data and making use of it in analytic processes. Some of these advances are quite new, and organizations need to keep up with the times.

In this chapter, we’ll discuss important technologies that make it possible to tame the big data tidal wave: the convergence of the analytic and data environments, massively parallel processing (MPP) architectures, the cloud, grid computing, and MapReduce.

Before we start, remember that the intent of this book is not to get too technical. This chapter, along with Chapters 5 and 6, will be the most technical, but the topics are covered at a conceptual level so that you can follow them even if you aren’t technical. In keeping with this goal, it was necessary to oversimplify at times. If more detail is of interest, there are other books that can go as far into the technical weeds as desired!

A HISTORY OF SCALABILITY

Until well into the 1900s, doing analytics ...
