How it works...

Each block of a BlockMatrix is defined as a tuple of ((Int, Int), Matrix): the pair of integers is the block's row and column index within the grid of blocks, and the Matrix is the local sub-matrix stored at that position. What sets this distributed matrix apart is that it has add() and multiply() methods that take another BlockMatrix as their argument. While setting it up is a bit confusing at first (especially on the fly as data arrives), the validate() helper method lets you verify your work and make sure the BlockMatrix is set up properly. This type of matrix can be converted to a local matrix, an IndexedRowMatrix, or a CoordinateMatrix. In the other direction, one of the most common ways to obtain a BlockMatrix is to convert a CoordinateMatrix (or an IndexedRowMatrix) with toBlockMatrix().
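The following is a minimal sketch of the points above, not the recipe's own listing. It builds a small 4 x 4 BlockMatrix from two 2 x 2 diagonal blocks, validates it, and then exercises add(), multiply(), and the three conversions. The object name BlockMatrixSketch, the helper buildExample, the sample values, and the local[*] master are all assumptions chosen for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Matrices
import org.apache.spark.mllib.linalg.distributed.BlockMatrix
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the names and data are illustrative only.
object BlockMatrixSketch {

  // Builds a 4 x 4 matrix out of two 2 x 2 blocks placed on the diagonal.
  // Block positions that are not listed are treated as zeros.
  def buildExample(sc: SparkContext): BlockMatrix = {
    val blocks = sc.parallelize(Seq(
      // ((blockRowIndex, blockColIndex), local sub-matrix in column-major order)
      ((0, 0), Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))),
      ((1, 1), Matrices.dense(2, 2, Array(5.0, 6.0, 7.0, 8.0)))
    ))
    // rowsPerBlock = 2, colsPerBlock = 2
    new BlockMatrix(blocks, 2, 2)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally for the sketch
      .appName("BlockMatrixSketch")
      .getOrCreate()

    val a = buildExample(spark.sparkContext)

    // Throws an exception if block indices or block sizes are inconsistent.
    a.validate()

    // Both operations take another BlockMatrix as their argument.
    val sum = a.add(a)
    val product = a.multiply(a)

    // Conversions to the other matrix types.
    val local = product.toLocalMatrix()
    val indexedRows = a.toIndexedRowMatrix()
    val coordinates = a.toCoordinateMatrix()

    println(s"Dimensions: ${a.numRows()} x ${a.numCols()}")
    println(s"Sum as a local matrix:\n${sum.toLocalMatrix()}")
    println(s"Product as a local matrix:\n$local")
    println(s"Indexed rows: ${indexedRows.rows.count()}")
    println(s"Coordinate entries: ${coordinates.entries.count()}")

    spark.stop()
  }
}
```

Calling validate() right after construction is a cheap way to catch mismatched block sizes or duplicate block indices before running add() or multiply(). Note that toLocalMatrix() collects the whole matrix on the driver, so it is only suitable for matrices that fit in memory.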
