Working with JSON using the Dataset API and SQL together

In this recipe, we explore how to use JSON with the Dataset API. In recent years, JSON has become a de facto standard format for data interchange.

We look at how a Dataset is loaded from JSON and queried with API calls such as select(). We then register the Dataset as a temporary view with createOrReplaceTempView() and run a SQL query against it, showing that a JSON file can be queried through both the API and SQL with ease.
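The flow described above can be sketched roughly as follows. This is a minimal illustration, not the recipe's own listing: the Car case class, its fields, the sample records, the temporary file, and the view name cars are all hypothetical and chosen only to make the sketch self-contained. It assumes Spark 2.x running in local mode.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

object JsonDatasetSketch {
  // Hypothetical record type; field names must match the JSON keys.
  case class Car(make: String, model: String, price: Double)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("JsonDatasetSketch")
      .getOrCreate()
    import spark.implicits._

    // Write a small sample file (assumed data) with one JSON object per line,
    // which is the layout spark.read.json expects by default.
    val sample =
      """{"make":"Tesla","model":"Model S","price":71000.0}
        |{"make":"Audi","model":"A3","price":36000.0}
        |{"make":"Ford","model":"Fiesta","price":18500.0}""".stripMargin
    val path = Files.createTempFile("cars", ".json")
    Files.write(path, sample.getBytes(StandardCharsets.UTF_8))

    // Read the JSON file and convert the resulting DataFrame to a typed Dataset.
    val cars = spark.read.json(path.toString).as[Car]

    // Dataset API: project two columns.
    cars.select("make", "price").show()

    // Register a temporary view, then query the same data with SQL.
    cars.createOrReplaceTempView("cars")
    spark.sql("SELECT make, model FROM cars WHERE price > 20000").show()

    spark.stop()
  }
}
```

Both queries run against the same underlying data, so the API and SQL styles can be mixed freely within one application.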
