Understanding vertex degrees

In graph theory, the degree of a vertex is the number of edges incident to it. In our flights example, a vertex's degree is therefore the total number of edges (that is, flights) into and out of that vertex (that is, airport). If we ask for the top 20 vertex degrees in descending order, we are asking for the 20 busiest airports (most flights in and out) in our graph. This can be quickly determined using the following query:

from pyspark.sql.functions import desc  # needed for desc(); skip if already imported
display(tripGraph.degrees.sort(desc("degree")).limit(20))

Because we're using the display command, we can quickly view a bar graph of this data:

Diving into more detail, here are the top 20 inDegrees (that is, airports ranked by incoming flights):

display(tripGraph.inDegrees.sort(desc("inDegree")).limit(20)) ...
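To make the relationship between these counts concrete, here is a minimal pure-Python sketch that does not use Spark or GraphFrames. The `flights` list and the `vertex_degrees` helper are hypothetical and exist only for illustration; they are not part of the book's dataset or of the GraphFrames API. The sketch assumes a directed graph in which every flight is one edge, so that a vertex's degree is its in-degree plus its out-degree (GraphFrames exposes the outgoing counterpart as `outDegrees`).

```python
from collections import Counter

# Hypothetical toy data: each tuple is one flight (source airport, destination airport).
flights = [
    ("SEA", "SFO"),
    ("SEA", "LAX"),
    ("SFO", "SEA"),
    ("LAX", "SFO"),
    ("SFO", "LAX"),
]


def vertex_degrees(edges):
    """Return (in_degree, out_degree, degree) Counters for a directed edge list."""
    out_degree = Counter(src for src, _ in edges)  # flights leaving each airport
    in_degree = Counter(dst for _, dst in edges)   # flights arriving at each airport
    degree = in_degree + out_degree                # total flights touching each airport
    return in_degree, out_degree, degree


in_degree, out_degree, degree = vertex_degrees(flights)

# Busiest airports first, analogous in spirit to sort(desc("degree")).limit(20).
top = degree.most_common(20)
print(top)
```

Because every edge adds one to the out-degree of its source and one to the in-degree of its destination, the degrees summed over all vertices always equal twice the number of edges. That identity is a handy sanity check on the real flight data as well.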
