Summary

In this chapter, you first got an overview of Spark abstractions and data modalities, and saw how different data types can be read into a Spark environment. You then saw how to load data from a variety of sources and how to do basic parsing of data from text input files. Now that you can load your data into a Spark RDD, the next chapter explores the operations you can perform on it.
