Take note of a few important points about Datasets:
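- Datasets use lazy evaluation: transformations are only recorded, and nothing runs until an action is called
- Datasets take advantage of the Spark SQL Catalyst optimizer
- Datasets take advantage of Tungsten's memory management, which keeps data in a compact binary format and can optionally store it off-heap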
- Plenty of systems will remain on pre-2.0 versions of Spark for the next two years, so for practical reasons you still need to learn and master RDDs and DataFrames.
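
The first two points can be seen in a small sketch. It assumes Spark 2.x with the spark-sql module on the classpath and a local master. The `Person` case class, the `DatasetSketch` object and the sample rows are hypothetical names invented for this illustration. The `filter` and `map` calls only build up a query plan, `explain()` prints the physical plan Spark produced for it, and the job runs only when the `collect()` action is called.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
// It is declared at the top level so Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object DatasetSketch {

  // Returns the names of all people aged 18 or over.
  def adultNames(spark: SparkSession): Array[String] = {
    import spark.implicits._ // encoders for case classes and primitives

    // Made-up sample data.
    val people: Dataset[Person] = Seq(
      Person("Ann", 34),
      Person("Bob", 17),
      Person("Cid", 52)
    ).toDS()

    // Transformations only: Spark records them but does not run a job yet.
    val adults: Dataset[String] = people.filter(_.age >= 18).map(_.name)

    // Prints the physical plan generated for the query above.
    adults.explain()

    // Action: this is the point where the work is actually executed.
    adults.collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally for the example
      .appName("dataset-sketch")
      .getOrCreate()
    try println(adultNames(spark).mkString(", "))
    finally spark.stop()
  }
}
```

The same statements can also be pasted into spark-shell, where a `spark` session already exists, so the `SparkSession.builder()` part is not needed there.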