Univariate linear regression in Apache Spark

Returning to our case study, let's develop a univariate linear regression model in Apache Spark using its machine learning library, MLlib, to predict the total number of daily bike renters from our bike sharing dataset.
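Before turning to the notebook, it may help to see what a univariate linear regression actually fits: a straight line, intercept plus slope times a single feature, chosen to minimise the sum of squared residuals. The following is a minimal plain-Python sketch of that idea, not the notebook's MLlib code. The function names `fit_univariate` and `predict` are hypothetical, and the toy values are made up for illustration rather than taken from the bike sharing dataset.

```python
# Illustrative sketch only: closed-form ordinary least squares with one feature.
# This is NOT the MLlib implementation; it just shows what is being estimated.


def fit_univariate(xs, ys):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict y for a single feature value x."""
    return intercept + slope * x


# Made-up toy values (hypothetical feature and label), chosen to lie on a line.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_y = [1.0, 3.0, 5.0, 7.0]

intercept, slope = fit_univariate(toy_x, toy_y)
print(intercept, slope)
print(predict(intercept, slope, 4.0))
```

In the case study, the single feature and the label (total daily renters) come from the bike sharing dataset, and the fitting is delegated to MLlib so that it can run on distributed data rather than in-memory Python lists.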

The following subsections describe each of the pertinent cells in the corresponding Jupyter Notebook for this use case, chp04-01-univariate-linear-regression.ipynb, which may be found in the GitHub repository accompanying this book.
  1. First, we import the required Python dependencies, including pandas (Python data analysis library), matplotlib (Python plotting library), and pyspark (Apache Spark Python API). By using the %matplotlib magic function, any plots that we ...
