References

DataFrame reference in the Spark SQL programming guide of the official Apache Spark documentation:

Databricks: Introducing DataFrames in Apache Spark for Large Scale Data Science:

Databricks: From Pandas to Apache Spark's DataFrame:

Scala API reference guide for Spark DataFrames:

A Cloudera blog post on Parquet, an efficient general-purpose columnar file format for Apache Hadoop:
