Spark GraphX in Action

Chapter 8. The missing algorithms

This chapter covers

Reading RDF files
Merging graphs
Filtering out isolated vertices
Using IndexedRDD for performance gains
Taking a simplistic approach to finding graph isomorphisms
Computing the global clustering coefficient

You’ve seen examples of reading graph data from edge list files in earlier chapters. RDF is another important file format, one in which many existing graph datasets are published. This chapter shows you how to read RDF files and how to apply that knowledge to the YAGO3 dataset.

Aside from the classic graph algorithms from chapter 6, there are other slightly more modern algorithms that one comes to expect in a graph database or graph processing system. Some of these are missing—not implemented ...

Spark GraphX in Action by Robin East, Michael Malak
