Querying the graph with Gremlin

To query the graph, we need to launch the Gremlin shell and create a TitanGraph instance connected to the local Cassandra backend:

$ cd titan
$ ./bin/gremlin.sh
          \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin> conf = new BaseConfiguration()
gremlin> conf.setProperty('storage.backend', 'cassandra')
gremlin> conf.setProperty('storage.hostname', 'localhost')
gremlin> g = TitanFactory.open(conf)

The g variable now contains a Graph object we can use to issue graph traversal queries. The following are a few sample queries you can use to get started:

  • To find all the users who have tweeted the #hadoop hashtag and show how many times each of them has done so, use the following code:
    gremlin> g.V('type', 'hashtag').has('value', 'hadoop').in.userid.groupCount.cap ...
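
To make the shape of that result concrete, here is a minimal plain-Python sketch of what a group count over a hashtag's incoming edges computes. It does not use Titan or Gremlin. The edge list, the user names, and the users_by_hashtag helper are all hypothetical, and it assumes that every incoming edge to a hashtag vertex comes from a user vertex carrying a userid property, with one edge per tweet.

```python
from collections import Counter

# Hypothetical toy data: each pair is one edge from a user (by userid)
# to a hashtag vertex (by value). One edge per tweet is an assumption.
edges = [
    ("alice", "hadoop"),
    ("bob", "hadoop"),
    ("alice", "hadoop"),
    ("carol", "storm"),
]


def users_by_hashtag(edges, hashtag):
    """Count, per userid, the edges that point at the given hashtag."""
    return Counter(user for user, tag in edges if tag == hashtag)


counts = users_by_hashtag(edges, "hadoop")
print(dict(counts))  # {'alice': 2, 'bob': 1}
```

The result is a map from user ID to occurrence count, which is the same kind of structure the groupCount step accumulates during the traversal.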
