Creating a DataFrame from CSV
In this recipe, we'll look at how to create a new DataFrame from a delimiter-separated values file.
Note
The code for this recipe can be found at https://github.com/arunma/ScalaDataAnalysisCookbook/blob/master/chapter1-spark-csv/src/main/scala/com/packt/scaladata/spark/csv/DataFrameCSV.scala.
How to do it...
This recipe involves four steps:
- Add the spark-csv support to our project.
- Create a Spark Config object that gives information on the environment that we are running Spark in.
- Create a Spark context that serves as an entry point into Spark. Then, we proceed to create an SQLContext from the Spark context.
- Load the CSV using the SQLContext.

CSV support isn't first-class in Spark, but it is available through an external library ...
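The four steps above can be sketched roughly as follows. This is a minimal illustration and not necessarily the code in the linked repository. It assumes Spark 1.4 or later (for `sqlContext.read`) with the Databricks spark-csv package on the classpath. The dependency versions in the comment, the object name `DataFrameFromCsvSketch`, the helper `loadCsv`, and the pipe-delimited sample data are all hypothetical choices made for the example.

```scala
// Step 1: add spark-csv to build.sbt. The versions below are assumptions;
// match them to the Spark and Scala versions you actually run.
//   libraryDependencies ++= Seq(
//     "org.apache.spark" %% "spark-core" % "1.6.3",
//     "org.apache.spark" %% "spark-sql"  % "1.6.3",
//     "com.databricks"   %% "spark-csv"  % "1.5.0"
//   )
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object DataFrameFromCsvSketch {

  // Step 4: load a delimited file whose first row holds the column names.
  // Without schema inference, every column is read as a string.
  def loadCsv(sqlContext: SQLContext, path: String, delimiter: String): DataFrame =
    sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("delimiter", delimiter)
      .load(path)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data, written to a temporary file so the sketch is self-contained.
    val path = Files.createTempFile("students", ".csv")
    Files.write(path, "id|name\n1|Ann\n2|Bob\n".getBytes("UTF-8"))

    // Step 2: describe the environment; local[2] runs Spark in-process with two threads.
    val conf = new SparkConf().setAppName("csvDataFrame").setMaster("local[2]")

    // Step 3: the Spark context is the entry point; the SQLContext is built on top of it.
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val students = loadCsv(sqlContext, path.toString, "|")
    students.printSchema()
    students.show()

    sc.stop()
  }
}
```

Passing the delimiter as an option lets the same helper read comma-, tab-, or pipe-separated files, which fits the recipe's focus on delimiter-separated values in general.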