Apache Spark powers a number of tools, both as a library and as an execution engine.
Spark Streaming (documented at http://spark.apache.org/docs/latest/streaming-programming-guide.html) is an extension of the core Spark API that allows data ingestion from streams such as Kafka, Flume, Twitter, ZeroMQ, and TCP sockets.
Spark Streaming receives live input data streams and divides the data into batches (time windows of a configurable duration, the batch interval), which the Spark core engine then processes to generate the final stream of results, also in batches. This high-level abstraction is called a DStream (org.apache.spark.streaming.dstream.DStream) and is implemented as a sequence of RDDs. DStreams support two kinds of operations: transformations ...
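The batch-of-RDDs model above can be sketched with a minimal word-count stream. This is an illustrative example, not from the source: the host, port, and batch interval are placeholder choices, and running it requires the spark-streaming dependency on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")   // at least 2 threads: one receiver, one processor
      .setAppName("StreamingWordCount")

    // Each 10-second batch interval becomes one RDD in the DStream.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Ingest lines from a TCP socket (e.g. `nc -lk 9999`); yields a DStream[String].
    val lines = ssc.socketTextStream("localhost", 9999)

    // Transformations: applied to each RDD in the sequence as it arrives.
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Output operation: materializes each batch's result.
    counts.print()

    ssc.start()            // begin receiving and processing
    ssc.awaitTermination() // block until the stream is stopped
  }
}
```

Because each batch is an ordinary RDD, the transformations here (flatMap, map, reduceByKey) are the same operations available on the Spark core API, applied once per batch interval.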