Using linear regression
Linear regression models the value of a response variable y as a linear function of one or more predictor variables (features) x.
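As a rough sketch of what this means in symbols, the model can be written as follows. The coefficient names $\beta_0, \dots, \beta_p$ and the error term $\varepsilon$ are notation assumed here for illustration; they do not come from the recipe itself.

```latex
% General form with p predictors x_1, ..., x_p:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon

% Special case with a single predictor, such as house size:
\text{price} = \beta_0 + \beta_1 \cdot \text{size} + \varepsilon
```

Fitting the model means choosing the coefficients so that the predicted values are as close as possible to the observed ones, typically by minimizing the sum of squared errors.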
Getting ready
Let's use some housing data to predict the price of a house based on its size. The following are the sizes and prices of houses in the City of Saratoga, CA, in early 2014:
House size (sq ft) | Price |
---|---|
2100 | $1,620,000 |
2300 | $1,690,000 |
2046 | $1,400,000 |
4314 | $2,000,000 |
1244 | $1,060,000 |
4608 | $3,830,000 |
2173 | $1,230,000 |
2750 | $2,400,000 |
4010 | $3,380,000 |
1959 | $1,480,000 |
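To get a feel for the relationship in these ten data points before turning to Spark, a straight line can be fit with the closed-form least-squares formulas. The following is a minimal sketch in plain Scala, not the Spark-based approach this recipe goes on to use; the names `HousePriceFit`, `houses`, and `fit` are hypothetical and chosen only for this illustration.

```scala
// Minimal sketch: ordinary least squares for a single predictor, plain Scala.
// HousePriceFit, houses, and fit are illustrative names, not part of Spark.
object HousePriceFit {

  // (house size in sq ft, price in dollars), copied from the table above
  val houses: Seq[(Double, Double)] = Seq(
    (2100.0, 1620000.0),
    (2300.0, 1690000.0),
    (2046.0, 1400000.0),
    (4314.0, 2000000.0),
    (1244.0, 1060000.0),
    (4608.0, 3830000.0),
    (2173.0, 1230000.0),
    (2750.0, 2400000.0),
    (4010.0, 3380000.0),
    (1959.0, 1480000.0)
  )

  /** Returns (intercept, slope) of the least-squares line y = intercept + slope * x. */
  def fit(points: Seq[(Double, Double)]): (Double, Double) = {
    require(points.size >= 2, "need at least two points")
    val n = points.size.toDouble
    val meanX = points.map(_._1).sum / n
    val meanY = points.map(_._2).sum / n
    // Sum of cross-deviations and sum of squared deviations of x
    val sxy = points.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = points.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
    require(sxx > 0.0, "all x values are identical")
    val slope = sxy / sxx
    (meanY - slope * meanX, slope)
  }

  def main(args: Array[String]): Unit = {
    val (intercept, slope) = fit(houses)
    println(f"price = $intercept%.0f + $slope%.2f * size")
    // Predict the price of a hypothetical 2,500 sq ft house
    val size = 2500.0
    println(f"predicted price for $size%.0f sq ft: ${intercept + slope * size}%.0f")
  }
}
```

The slope can be read as the estimated price change per additional square foot, and the intercept as the fitted price at size zero, which has no practical meaning on its own. With only ten houses, the fit is best treated as a rough summary rather than a reliable predictor.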
Here's a graphical representation of the same data:
How to do it…
- Start the Spark shell:
$ spark-shell ...