O'Reilly logo
  • Alex Buzunov thinks this is interesting:

The MapReduce framework was used in cooperation with a distributed filesystem, the Google File System (GFS or GoogleFS), which was designed to partition and replicate data across the computing cluster. Partitioning was useful for storing and processing datasets that wouldn't fit on a single node while replication ensured that the system was able to handle failures gracefully. MapReduce was used by Google, in conjunction with GFS, for indexing of their web pages. Later on, the MapReduce and GFS concepts were implemented by Doug Cutting (at the time, an employee at Yahoo!), resulting in the first versions of the Hadoop Distributed File System (HDFS) and Hadoop MapReduce.

From

Cover of Python High Performance - Second Edition

Note

Should it be "with distributed filesystem" (without "a")?