Data Just Right: Introduction to Large-Scale Data & Analytics by Michael Manoochehri

8. Putting it Together: MapReduce Data Pipelines

It’s kind of fun to do the impossible.

Walt Disney

Human brains aren’t very good at keeping track of millions of separate data points, but we do know that there is lots of data out there, just waiting to be collected, analyzed, and visualized. To cope with the complexity, we create metaphors to wrap our heads around the problem. Need to store millions of records until we figure out what to do with them? Let’s file them away in a data warehouse. Need to analyze a billion data points? Let’s crunch them down into something more manageable.

No longer should we be satisfied with just storing data and chipping away little bits of it to study. Now that distributed computational tools are becoming ...
