  • Ashish Kaushal thinks this is interesting:

Next, the data is sent to a partitioner. If we specified a partition in the ProducerRecord, the partitioner doesn’t do anything and simply returns the partition we specified. If we didn’t, the partitioner will choose a partition for us, usually based on the ProducerRecord key. Once a partition is selected, the producer knows which topic and partition the record will go to. It then adds the record to a batch of records that will also be sent to the same topic and partition. A separate thread is responsible for sending those batches of records to the appropriate Kafka brokers.
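The partition choice described in this passage can be sketched in a few lines. This is only an illustration under stated assumptions, not the Kafka client's code: `ProducerRecord` here is a hypothetical stand-in class, `choose_partition` is a made-up helper, CRC32 replaces whatever hash the real client uses, and keyless records are simply sent to partition 0 to keep the sketch short.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProducerRecord:
    """Hypothetical stand-in for Kafka's ProducerRecord (not the real class)."""
    topic: str
    value: bytes
    key: Optional[bytes] = None
    partition: Optional[int] = None


def choose_partition(record: ProducerRecord, num_partitions: int) -> int:
    """Return the partition number a record should be written to.

    Assumptions: CRC32 stands in for the client's real hash function, and
    keyless records always go to partition 0 (a simplification).
    """
    if record.partition is not None:
        # Explicit partition: nothing to decide, hand it back unchanged.
        return record.partition
    if record.key is not None:
        # Same key -> same hash -> same partition, as long as the
        # number of partitions does not change.
        return zlib.crc32(record.key) % num_partitions
    return 0  # simplification; a real client spreads keyless records around


explicit = choose_partition(ProducerRecord("orders", b"v1", partition=2), 4)
keyed_a = choose_partition(ProducerRecord("orders", b"v2", key=b"user-1"), 4)
keyed_b = choose_partition(ProducerRecord("orders", b"v3", key=b"user-1"), 4)
```

Here `explicit` is the partition that was asked for, and `keyed_a` and `keyed_b` are equal because both records carry the same key.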


Cover of Kafka: The Definitive Guide


There are a few things that are not so clear in this paragraph:

Clear - A ProducerRecord must have a topic and data.
Clear - A ProducerRecord may have a key and/or a partition.
Unclear - It says "next the data is sent to partitioner". Is the whole ProducerRecord sent to the partitioner (another entity)?
Clear - The partitioner checks for a partition: if one is specified, it returns that partition; if not, it picks a partition for the producer and returns it.
Stmt - Once the producer has the partition and the topic (which it already had), it adds the record to a batch of records that will be sent to the same topic and partition. A separate thread will find the correct broker that has that partition and dispatch the records to...
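One way to picture the batching and the separate sender thread is the rough sketch below. Everything in it is an assumption made for illustration: `RecordAccumulator`, `sender_loop`, the `leaders` lookup table and the broker names are hypothetical, and a batch is closed only when it reaches a fixed record count, which is simpler than what a real client does.

```python
import queue
import threading
from collections import defaultdict


class RecordAccumulator:
    """Hypothetical buffer that groups records by (topic, partition)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._open = defaultdict(list)  # (topic, partition) -> pending records
        self._ready = queue.Queue()     # full batches waiting for the sender
        self._lock = threading.Lock()

    def append(self, topic, partition, value):
        """Called by the producer thread once the partition is known."""
        with self._lock:
            batch = self._open[(topic, partition)]
            batch.append(value)
            if len(batch) >= self.batch_size:
                self._ready.put(((topic, partition), batch))
                self._open[(topic, partition)] = []

    def close(self):
        """Flush partially filled batches, then signal the sender to stop."""
        with self._lock:
            for topic_partition, batch in self._open.items():
                if batch:
                    self._ready.put((topic_partition, batch))
            self._open.clear()
        self._ready.put(None)

    def next_batch(self):
        """Block until a batch (or the stop signal, None) is available."""
        return self._ready.get()


def sender_loop(accumulator, leaders, sent):
    """Runs on its own thread: look up the broker and 'send' each batch."""
    while True:
        item = accumulator.next_batch()
        if item is None:
            break
        topic_partition, batch = item
        broker = leaders[topic_partition]  # assumed broker lookup table
        sent.append((broker, topic_partition, batch))  # stand-in for network I/O


# Hypothetical cluster layout: which broker holds which partition.
leaders = {("orders", 0): "broker-1", ("orders", 1): "broker-2"}
accumulator = RecordAccumulator(batch_size=2)
sent = []
sender = threading.Thread(target=sender_loop, args=(accumulator, leaders, sent))
sender.start()

accumulator.append("orders", 0, b"a")
accumulator.append("orders", 1, b"b")
accumulator.append("orders", 0, b"c")  # fills the batch for ("orders", 0)
accumulator.close()
sender.join()
```

In this sketch the producer thread only calls `append`, and the sender thread alone decides which broker each batch goes to, which is one reading of the "separate thread" in the quoted paragraph.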