Developing a real-time streaming pipeline with Storm

In this section, we will create the following two pipelines:

  • Streaming pipeline with Kafka - Storm - MySQL
  • Streaming pipeline with Kafka - Storm - HDFS - Hive

We will first see how data streams flow from Kafka, through Storm, into a MySQL table.

The whole pipeline will work as follows:

  1. We will ingest customer records (customer_firstname and customer_lastname) into Kafka using the Kafka console-producer API (see the producer commands after this list).
  2. After that, Storm will pull the messages from Kafka.
  3. A connection to MySQL will be established.
  4. Storm will use a MySQL bolt to ingest the records into the MySQL table; MySQL will automatically generate customer_id (the topology is sketched after this list).
  5. The MySQL table data (customer_id, customer_firstname, and customer_lastname) can then be queried to verify that the records have arrived (a sample table definition and query follow).
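
For step 1, the following commands show a minimal sketch of the producer side, assuming a single-broker setup, a topic named customers, and comma-separated records (the topic name and record format are assumptions for this example, not fixed by the chapter):

    # Create the topic the pipeline will read from (topic name is assumed).
    kafka-topics.sh --create --zookeeper localhost:2181 \
      --replication-factor 1 --partitions 1 --topic customers

    # Type one comma-separated customer record per line at the '>' prompt.
    kafka-console-producer.sh --broker-list localhost:9092 --topic customers
    > John,Doe
    > Jane,Smith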
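Steps 2 to 4 can be wired together in a single Storm topology. The following Java sketch uses the storm-kafka KafkaSpout and a hypothetical MysqlBolt that parses each comma-separated message and inserts it over plain JDBC; the topic name, ZooKeeper address, database name, and credentials are all assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.Map;
    import java.util.UUID;

    import org.apache.storm.Config;
    import org.apache.storm.LocalCluster;
    import org.apache.storm.kafka.BrokerHosts;
    import org.apache.storm.kafka.KafkaSpout;
    import org.apache.storm.kafka.SpoutConfig;
    import org.apache.storm.kafka.StringScheme;
    import org.apache.storm.kafka.ZkHosts;
    import org.apache.storm.spout.SchemeAsMultiScheme;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    public class KafkaStormMysqlTopology {

        // Hypothetical bolt: parses "firstname,lastname" messages and inserts
        // them into MySQL. customer_id is generated by AUTO_INCREMENT, so the
        // INSERT deliberately omits it (step 4).
        public static class MysqlBolt extends BaseRichBolt {
            private transient Connection conn;
            private OutputCollector collector;

            @Override
            public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
                this.collector = collector;
                try {
                    // Assumed database name and credentials (step 3).
                    conn = DriverManager.getConnection(
                            "jdbc:mysql://localhost:3306/retail", "storm", "storm");
                } catch (Exception e) {
                    throw new RuntimeException("Cannot connect to MySQL", e);
                }
            }

            @Override
            public void execute(Tuple tuple) {
                // StringScheme emits each Kafka message as a single string field.
                String[] parts = tuple.getString(0).split(",");
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO customer (customer_firstname, customer_lastname) VALUES (?, ?)")) {
                    ps.setString(1, parts[0].trim());
                    ps.setString(2, parts[1].trim());
                    ps.executeUpdate();
                    collector.ack(tuple);
                } catch (Exception e) {
                    collector.fail(tuple); // let Storm replay the message
                }
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                // Terminal bolt: nothing is emitted downstream.
            }
        }

        public static void main(String[] args) throws Exception {
            // KafkaSpout (step 2): read the assumed "customers" topic via ZooKeeper.
            BrokerHosts hosts = new ZkHosts("localhost:2181");
            SpoutConfig spoutConfig = new SpoutConfig(hosts, "customers", "/customers",
                    UUID.randomUUID().toString());
            spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);
            builder.setBolt("mysql-bolt", new MysqlBolt(), 1).shuffleGrouping("kafka-spout");

            // Run locally for testing; submit with StormSubmitter on a real cluster.
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("kafka-storm-mysql", new Config(), builder.createTopology());
        }
    }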
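For steps 4 and 5, the target table must exist before the topology starts. A possible definition with an auto-generated key, plus a query to verify the ingested rows (the column names follow the chapter; the table name customer and the VARCHAR sizes are assumptions):

    -- customer_id is AUTO_INCREMENT, so MySQL assigns it on every insert (step 4).
    CREATE TABLE customer (
      customer_id        INT NOT NULL AUTO_INCREMENT,
      customer_firstname VARCHAR(50),
      customer_lastname  VARCHAR(50),
      PRIMARY KEY (customer_id)
    );

    -- Step 5: confirm that the records produced on the Kafka console arrived.
    SELECT customer_id, customer_firstname, customer_lastname FROM customer;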
