Streaming using Kafka

Kafka is a distributed, partitioned, and replicated commit log service. Put simply, it is a distributed messaging system. Kafka organizes its message feeds into categories called topics. An example of a topic is the ticker symbol of a company whose news you would like to follow, for example, CSCO for Cisco.
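
To make the idea of a topic concrete, here is a minimal in-memory sketch in Python. It is a toy model and not Kafka's API: the class name ToyMessageFeed, its publish and read methods, the second ticker symbol ORCL, and the sample headlines are all invented for illustration. It shows only that messages are filed under a topic name and read back in the order they were appended.

```python
from collections import defaultdict


class ToyMessageFeed:
    """In-memory stand-in for a message feed organized by topic (not Kafka's API)."""

    def __init__(self):
        # Each topic name maps to an append-only list of messages.
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        """Append a message to the topic and return its position within that topic."""
        self._topics[topic].append(message)
        return len(self._topics[topic]) - 1

    def read(self, topic, start=0):
        """Return the topic's messages from position `start` onward, oldest first."""
        # .get() avoids creating an empty topic as a side effect of reading.
        return list(self._topics.get(topic, []))[start:]


feed = ToyMessageFeed()

# Invented sample headlines, filed under ticker-symbol topics.
feed.publish("CSCO", "sample headline A about Cisco")
feed.publish("CSCO", "sample headline B about Cisco")
feed.publish("ORCL", "sample headline about Oracle")

# A reader interested only in Cisco sees only the CSCO topic.
csco_news = feed.read("CSCO")
print(csco_news)
```

Keeping one append-only list per topic is the simplest structure that preserves message order within a topic; a real Kafka cluster additionally distributes and replicates this data across brokers, which the sketch does not attempt to model.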

Processes that produce messages are called producers, and those that consume messages are called consumers. In traditional messaging, the messaging service has one central messaging server, also called a broker. Since Kafka is a distributed messaging service, it has a cluster of brokers, which functionally act as one Kafka broker, as shown here:

For each topic, Kafka maintains a partitioned log. This partitioned log consists ...
