Processing large chunks of data has been a major challenge in architecting modern day web applications. Hadoop is an open source framework from Apache that provides libraries to process and store large chunks of data. It offers a scalable, cost-effective, and fault-tolerant solution to store and process large chunks of data. In this chapter, let us demonstrate how the Spring Framework supports Hadoop. Map and Reduce, Hive, and HDFS are some of the Hadoop key terminology used with cloud-based technologies. Google has also come with its own Map and Reduce and distributed file system framework, apart from Apache Hadoop.
Apache Hadoop consists of the following modules: