How it works...

In this recipe, we will be utilizing the market data of closing prices for General Electric (GE) dating back to 1972. To simplify the data, we have preprocessed for the purposes of this recipe. We use the same method from the previous recipe, Streaming DataFrames for real-time machine learning, by peeking into the JSON object to discover the schema (step 7), which we impose on the stream in step 8.

The following code shows how to use the schema to make the stream look like a simple table that you can read from on the fly. This is a powerful concept that makes stream programming accessible to more programmers. The schema(s.schema) and as[StockPrice] from the following code snippet are required to create the streaming Dataset, ...

Get Apache Spark 2.x Machine Learning Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.