Spark documentation for the textFile() and wholeTextFiles() functions:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkContext
The textFile() API is a single abstraction for reading from external data sources: the protocol and path in its argument are enough to select the right reader. We'll demonstrate reading from an ASCII text file, Amazon S3, and HDFS, with code snippets that you can adapt to build your own system.
- The path can range from a simple path (for example, a local text file), to a complete URI with the required protocol (for example, s3n for AWS storage buckets), to a complete resource path with server and port configuration (for example, to read an HDFS file from a Hadoop cluster). ...
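
To make the three path forms concrete, here is a minimal Python sketch that breaks each form into its protocol, server, port, and resource path using the standard library. The bucket, host, port, and file names are hypothetical placeholders, not values from this text. The read_lines() helper is likewise only an illustration: it assumes an existing SparkContext is passed in as sc and hands the path string to sc.textFile() unchanged.

```python
from urllib.parse import urlsplit

# Hypothetical example paths: the bucket, host, port, and file names
# below are placeholders chosen for illustration only.
EXAMPLE_PATHS = {
    "local": "data/sample.txt",
    "s3": "s3n://my-bucket/logs/sample.txt",
    "hdfs": "hdfs://namenode.example.com:8020/user/demo/sample.txt",
}


def describe_path(path):
    """Split a path string into the parts that distinguish the three forms."""
    parts = urlsplit(path)
    return {
        "scheme": parts.scheme,   # empty string for a simple path
        "host": parts.hostname,   # None when no server or bucket is given
        "port": parts.port,       # None when no port is given
        "path": parts.path,
    }


def read_lines(sc, path):
    """Pass the path to textFile() as-is; sc is assumed to be a SparkContext."""
    return sc.textFile(path)


if __name__ == "__main__":
    for label, path in EXAMPLE_PATHS.items():
        print(label, describe_path(path))
```

The same read_lines() call would be used for all three forms; only the path string changes, which is the point of having a single abstraction.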