Twitter's Real-Time Data Stack

Video Description

This year, Twitter open sourced two powerful real-time analytics tools: DistributedLog, a high-performance log service, and Heron, a distributed stream computation system.

A few weeks after Heron was open sourced, Karthik Ramasamy, engineering manager and technical lead for real-time analytics at Twitter, delivered a talk at Strata + Hadoop World in London to unveil the system and discuss:

  • An overview of Heron as a micro stream engine and its architectural components
  • How Twitter has been running Heron in production
  • The operational experience and challenges of running Heron at scale, including a discussion of stragglers
  • Heron's minimal resource usage and performance numbers

Leading up to Twitter's open sourcing of DistributedLog, Sijie Guo, software engineer and tech lead of the DistributedLog project, spoke at Strata + Hadoop World in San Jose to introduce the service. Key components of his talk include:

  • Why Twitter built DistributedLog
  • Technical decisions and challenges behind building DistributedLog
  • How Twitter uses DistributedLog to support different workloads
  • How Twitter runs the same software stack in multiple data centers to achieve global consistency