How it works...

The core of the work gets done by declaring a RowMatrix() and then invoking the computeSVD() method to decompose the matrix into subcomponents that are much smaller, but approximate the original with uncanny accuracy:

valmat = new RowMatrix(rows)val col = 10 //number of leading singular valuesval computeU = trueval svd = mat.computeSVD(col, computeU)

SVD is a factorization technique for a real or complex matrix. At its core, it is a straight linear algebra which was actually derived from PCA itself. The concept is used extensively in recommender systems (ALS, SVD), topic modeling (LDA), and text analytics, to derive concepts from primitive high-dimensional matrices. Let's try to outline this without getting into the mathematical ...

Get Apache Spark 2.x Machine Learning Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.