Understanding motifs

To easily understand the complex relationship of city airports and the flights between each other, we can use motifs to find patterns of airports (for example, vertices) connected by flights (that is, edges). The result is a DataFrame in which the column names are given by the motif keys. Note that motif finding is one of the new graph algorithms supported as part of GraphFrames.

For example, let's determine the delays that are due to San Francisco International Airport (SFO):

# Generate motifs
motifs = tripGraphPrime.find("(a)-[ab]->(b); (b)-[bc]->(c)")\
  .filter("(b.id = 'SFO') and (ab.delay > 500 or bc.delay > 500) and bc.tripid > ab.tripid and bc.tripid < ab.tripid + 10000")

# Display motifs
display(motifs)

Breaking down the ...

Get Learning PySpark now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.