Even though MapReduce provides a powerful way to process large amounts of data, it is held back by several drawbacks:
- A limited set of operators (essentially just map and reduce)
- No support for real-time or streaming data processing
- No caching of intermediate results, which makes iterative algorithms slow
These are just a few examples. Because Apache Spark was built from the ground up, it approaches big data computation in a more general way: it gives developers data structures that can represent virtually any type of data and compute over it more efficiently.
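The caching drawback mentioned above is a good illustration of the difference. The sketch below, written against Spark's Scala API, shows how a dataset can be cached in memory so that repeated actions reuse it instead of recomputing it from scratch on each pass, as MapReduce would. The object and application names are illustrative, and running it requires a Spark dependency on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example name; assumes the spark-sql dependency is available.
object CachingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CachingSketch")
      .master("local[*]") // run locally for illustration
      .getOrCreate()

    // Build a distributed dataset and derive a transformed one from it.
    val numbers = spark.sparkContext.parallelize(1 to 1000000)
    val squares = numbers.map(n => n.toLong * n).cache() // keep in memory after first use

    // The first action computes and caches; the second reuses the cached data.
    println(squares.sum())
    println(squares.count())

    spark.stop()
  }
}
```

In MapReduce, each of those two actions would be a separate job writing intermediate results to disk; with Spark's `cache()`, the second action reads the already-materialized partitions from memory.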