In older versions of Spark, we needed a special package to read CSV files, but now we can take advantage of spark.sparkContext.textFile(dataFile) to ingest the file. The spark that starts the statement is the Spark session (a handle to the cluster) and can be given any name you like when it is created, as shown here:
val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("MyCSV")
  .config("spark.sql.warehouse.dir", ".")
  .getOrCreate()

spark.sparkContext.textFile(dataFile)
Spark 2.0+ uses spark.sql.warehouse.dir, rather than hive.metastore.warehouse.dir, to set the warehouse location where tables are stored. The default value of spark.sql.warehouse.dir is System.getProperty("user.dir") ...
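Note that textFile() returns an RDD of raw text lines, so each line still has to be split into fields by hand. A minimal end-to-end sketch (the dataFile path and the assumption of comma-separated columns are hypothetical, not from the original recipe) might look like:

import org.apache.spark.sql.SparkSession

object MyCSV {
  def main(args: Array[String]): Unit = {
    // Create the Spark session (handle to the cluster); the name
    // "spark" is conventional but arbitrary.
    val spark = SparkSession
      .builder
      .master("local[*]")
      .appName("MyCSV")
      .config("spark.sql.warehouse.dir", ".")
      .getOrCreate()

    // Hypothetical path; replace with the location of your own CSV file.
    val dataFile = "data/input.csv"

    // textFile() yields an RDD[String], one element per line of the file;
    // splitting on the comma turns each line into an Array[String] of columns.
    val columns = spark.sparkContext
      .textFile(dataFile)
      .map(_.split(","))

    // Show the first few parsed rows.
    columns.take(5).foreach(cols => println(cols.mkString(" | ")))

    spark.stop()
  }
}

This manual split is fine for simple, well-formed files; fields containing embedded commas or quotes need a real CSV parser.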