Chapter 12. GraphX

In this chapter, we will dive into the graph-processing capabilities of Spark, the GraphX package-very interesting, useful, and relevant. You will see things such as PageRank, connections, and communities. We will start with an introduction to graph processing and then progress to code the GraphX APIs on a simple, yet interesting giraffe graph. We will explore the organization and structure of the APIs and objects and then dive into algorithms that explore the community, PageRank, and so forth. Finally, we will explore the retweet network of the #alphago community, exploring the data pipeline, the map attributes of properties, and vertices and edges. We'll then create a graph and run algorithms. Should be an interesting chapter! ...

Get Fast Data Processing with Spark 2 - Third Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.