Summary

In this chapter, we explained various machine learning algorithms, how they are implemented in the MLlib library, and how they can be used with the pipeline API for streamlined execution. The concepts were illustrated with Python and Scala code examples for ready reference.

In the next chapter, we will discuss how Spark supports the R programming language, focusing on some of the algorithms and how they are executed, similar to what we covered in this chapter.
