We cover both gradient descent and SGD in detail in Chapter 9, Optimization - Going Down the Hill with Gradient Descent. For this chapter, the reader can treat SGD as a black-box optimization technique that minimizes the loss function when fitting a line to a set of points. Several parameters (most notably the step size and the number of iterations) affect the behavior of SGD, and we encourage the reader to push these parameters to either extreme and observe the poor performance and non-convergence that result (that is, the output will appear as NaN).
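The divergence effect can be seen outside Spark with a minimal sketch in plain Python. This is an illustrative assumption, not the Spark API: it uses full-batch gradient descent rather than SGD, and the `fit_line` helper, the toy data, and the parameter values are all invented for the demonstration. The point it shows carries over, though: a moderate step size converges, while an extreme one makes every update overshoot until the parameters blow up to NaN.

```python
import math

# Toy data: ten points on the line y = 2x + 1 (illustrative, not from the book)
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

def fit_line(step_size, iterations):
    """Fit y = w*x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Partial derivatives of the MSE loss with respect to w and b
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= step_size * grad_w
        b -= step_size * grad_b
    return w, b

# A moderate step size converges close to the true slope and intercept
w_good, b_good = fit_line(step_size=0.01, iterations=5000)

# An extreme step size makes each update overshoot the minimum: the
# parameters oscillate with growing magnitude until they overflow to inf/NaN
w_bad, b_bad = fit_line(step_size=1.0, iterations=300)

print(w_good, b_good)        # close to 2.0 and 1.0
print(math.isfinite(w_bad))  # False: this run has diverged
```

In Spark, the analogous knobs are the step size and number of iterations passed when training a `LinearRegressionWithSGD` model; setting the step size too high produces exactly this kind of NaN result.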
The documentation for the LinearRegressionWithSGD() constructor is available at the following URL:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package