Powerful Production Equipment

A shortage of both funds and space pushed Google's co-founders to build a unique, highly automated factory using the latest techniques in distributed data processing. The entire Google network rests on a model for processing large data sets and dispatching tasks across a large cluster, built on MapReduce and the Google File System.

MapReduce distributes work by running programs in parallel on a large cluster of commodity machines. The MapReduce system balances and manages program execution, so Google's programmers can easily use the resources of a large distributed system. According to Google, "a typical MapReduce computation processes many terabytes of data on thousands of machines."[85]
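
To make the idea concrete, here is a minimal sketch of the map, group, and reduce steps applied to a word count. It is an illustration only, not Google's implementation: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and a small thread pool on one machine stands in for the cluster of commodity machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group all emitted values by key so each key is reduced exactly once."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Assumption for illustration: threads play the role of worker machines.
    # A real system would schedule these tasks across many separate hosts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_words, chunks))
    groups = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in groups.items())


chunks = ["the quick brown fox", "the lazy dog", "The fox"]
counts = word_count(chunks)
print(counts)
```

Each chunk is mapped independently of the others, which is what allows the work to be spread across many machines. The programmer writes only the map and reduce functions, while the surrounding system handles distribution and scheduling.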

The Google File System ...
