Twitter graph topology

The Twitter graph topology will read raw tweet data from the Kafka queue, parse out the relevant information, and then create nodes and relationships in the Titan graph database. Instead of writing to the graph database individually for each tuple received, we will create a Trident state implementation that performs persistence operations in bulk using Trident's transaction mechanism.

This approach offers several benefits. First, for graph databases such as Titan that support transactions, we can leverage this capability to provide additional exactly-once processing guarantees. Second, it allows us to perform a bulk write followed by a bulk commit (when supported) for an entire batch of tuples rather than a write-commit ...
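
To make the two benefits concrete, the following is a minimal plain-Java sketch of the pattern rather than the book's implementation. It does not use the Storm or Titan APIs: `InMemoryGraph` and `GraphBatchWriter` are hypothetical stand-ins for a transactional graph store and a batch-oriented state object. It assumes that batches are committed in increasing transaction ID order and that a replayed batch carries the same transaction ID as the original, which is the behavior Trident's transactional mode is designed to give.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GraphBatchSketch {

    /** Hypothetical stand-in for a transactional graph store (not Titan's API). */
    static class InMemoryGraph {
        final List<String> committed = new ArrayList<>();
        final List<String> pending = new ArrayList<>();
        long lastCommittedTxId = -1; // ID of the last batch that was committed
        int commitCount = 0;         // number of commits performed so far

        /** Stage one relationship; nothing is visible until commit(). */
        void addEdge(String from, String label, String to) {
            pending.add(from + " -[" + label + "]-> " + to);
        }

        /** Make all staged writes visible and record the batch's transaction ID. */
        void commit(long txId) {
            committed.addAll(pending);
            pending.clear();
            lastCommittedTxId = txId;
            commitCount++;
        }
    }

    /** Hypothetical batch writer: one bulk write and one commit per batch. */
    static class GraphBatchWriter {
        private final InMemoryGraph graph;

        GraphBatchWriter(InMemoryGraph graph) {
            this.graph = graph;
        }

        /** Returns true if the batch was applied, false if it was a replay. */
        boolean writeBatch(long txId, List<String[]> edges) {
            if (txId <= graph.lastCommittedTxId) {
                return false; // already persisted; skipping avoids duplicate edges
            }
            for (String[] e : edges) {
                graph.addEdge(e[0], e[1], e[2]); // bulk write, no commit yet
            }
            graph.commit(txId); // single commit for the whole batch
            return true;
        }
    }

    public static void main(String[] args) {
        InMemoryGraph graph = new InMemoryGraph();
        GraphBatchWriter writer = new GraphBatchWriter(graph);

        // Illustrative data only: two relationships parsed from one tweet.
        List<String[]> batch = Arrays.asList(
                new String[] {"@alice", "mentions", "@bob"},
                new String[] {"@alice", "uses", "#storm"});

        System.out.println("first attempt applied: " + writer.writeBatch(1L, batch));
        System.out.println("replay applied: " + writer.writeBatch(1L, batch));
        System.out.println("edges stored: " + graph.committed.size());
        System.out.println("commits issued: " + graph.commitCount);
    }
}
```

In this sketch the replayed batch is recognized by its transaction ID and ignored, so each relationship is stored once, and the store sees one commit per batch instead of one per tuple.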
