Gradient descent is an iterative numerical method. It starts from an initial guess and repeatedly asks, "How badly am I doing?" by evaluating an error function: the squared distance between the predicted and actual values in the training file. It then adjusts the guess in the direction that reduces that error.
In this program, we chose a simple linear model, f(x) = b + mx. To find the best combination of slope m and intercept b, we had 52 actual (age, salary) pairs to plug into the model (Predicted Salary = Slope x Age + Intercept). In short, we wanted the slope and intercept that fit a line minimizing the squared distance between the predicted and actual salaries. The squared function gives us all positive ...
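To make the idea concrete, here is a minimal sketch of gradient descent for the line f(x) = b + mx, written in Python. It is not the program's actual code: the function names, the learning rate, the iteration count, and the five (age, salary) pairs are all assumptions made up for illustration, and the error is averaged over the points, which may differ from how the original program scales it.

```python
def mean_squared_error(b, m, points):
    """Average squared distance between predicted and actual salary."""
    total = 0.0
    for age, salary in points:
        predicted = m * age + b
        total += (salary - predicted) ** 2
    return total / len(points)


def gradient_step(b, m, points, learning_rate):
    """Move b and m one small step against the gradient of the error."""
    n = len(points)
    grad_b = 0.0
    grad_m = 0.0
    for age, salary in points:
        residual = salary - (m * age + b)
        # Partial derivatives of the mean squared error w.r.t. b and m.
        grad_b += -2.0 * residual / n
        grad_m += -2.0 * age * residual / n
    return b - learning_rate * grad_b, m - learning_rate * grad_m


def fit_line(points, learning_rate=0.0005, iterations=1000, b=0.0, m=0.0):
    """Start from an initial guess (b, m) and repeat the update step."""
    for _ in range(iterations):
        b, m = gradient_step(b, m, points, learning_rate)
    return b, m


# Made-up (age, salary in thousands) pairs, NOT the 52 pairs from the article.
points = [(25, 40.0), (30, 48.0), (35, 55.0), (40, 61.0), (45, 70.0)]

initial_error = mean_squared_error(0.0, 0.0, points)
b, m = fit_line(points)
final_error = mean_squared_error(b, m, points)
print("error before:", initial_error)
print("error after: ", final_error)
print("intercept b:", b, "slope m:", m)
```

The learning rate here is deliberately small because the ages are not rescaled; with unscaled inputs, a larger step can make the updates overshoot and the error grow instead of shrink. In this sketch the slope settles quickly while the intercept moves slowly, so more iterations (or rescaled inputs) would be needed to reach the true least-squares line.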