Using linear regression

Linear regression models the value of a response variable y as a linear function of one or more predictor variables, or features, x.
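As a rough sketch in standard textbook notation, the model can be written as follows. The coefficient symbols and the error term are assumptions introduced here for illustration; the original text does not name them. Here p is the number of predictors and ε is the part of y that the line does not explain.

```latex
y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon
```

With a single predictor, such as house size, this reduces to a straight line with intercept \beta_0 and slope \beta_1.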

Getting ready

Let's use some housing data to predict the price of a house based on its size. The following are the sizes and prices of houses in the City of Saratoga, CA, in early 2014:

House size (sq ft)    Price
2100                  $1,620,000
2300                  $1,690,000
2046                  $1,400,000
4314                  $2,000,000
1244                  $1,060,000
4608                  $3,830,000
2173                  $1,230,000
2750                  $2,400,000
4010                  $3,380,000
1959                  $1,480,000

Here's a graphical representation of the same data:

[Figure: house price plotted against house size]
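Before turning to Spark, it may help to see what a fitted line for this data looks like. The following is a minimal sketch in plain Scala that uses the closed-form least-squares solution for a single predictor. It is not the recipe's Spark-based method, and the names `data`, `fitLine`, and `predictPrice` are hypothetical, chosen only for this illustration.

```scala
// Sizes (sq ft) and prices (USD) copied from the housing data listed earlier.
val data: Seq[(Double, Double)] = Seq(
  (2100.0, 1620000.0),
  (2300.0, 1690000.0),
  (2046.0, 1400000.0),
  (4314.0, 2000000.0),
  (1244.0, 1060000.0),
  (4608.0, 3830000.0),
  (2173.0, 1230000.0),
  (2750.0, 2400000.0),
  (4010.0, 3380000.0),
  (1959.0, 1480000.0)
)

// Hypothetical helper: ordinary least squares for one predictor.
// Returns (intercept, slope), where slope = cov(x, y) / var(x).
def fitLine(points: Seq[(Double, Double)]): (Double, Double) = {
  val n = points.size.toDouble
  val meanX = points.map(_._1).sum / n
  val meanY = points.map(_._2).sum / n
  val sxy = points.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
  val sxx = points.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
  val slope = sxy / sxx
  val intercept = meanY - slope * meanX
  (intercept, slope)
}

val (intercept, slope) = fitLine(data)

// Predicted price in USD for a house of the given size.
def predictPrice(sizeSqFt: Double): Double = intercept + slope * sizeSqFt

println(f"price ~ $intercept%.0f + $slope%.2f * size")
println(f"predicted price for 2500 sq ft: ${predictPrice(2500.0)}%.0f")
```

The snippet has no Spark dependency, so it can be pasted into any Scala REPL, including the Spark shell. With only ten data points the fitted line is a rough illustration rather than a reliable price model.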

How to do it…

  1. Start the Spark shell:
    $ spark-shell ...
