We start producing messages to Kafka by creating a ProducerRecord, which must
include the topic we want to send the record to and a value. Optionally, we can also
specify a key and/or a partition. Once we send the ProducerRecord, the first thing the
producer will do is serialize the key and value objects to ByteArrays so they can be
sent over the network.
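The sketch below shows roughly what this looks like with the Java producer API:
the serializers that turn the key and value into byte arrays are set in the
producer configuration, and the record is built from a topic, an optional key,
and a value. The broker address, topic name, key, and value here are
illustrative placeholders, not anything Kafka requires.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");   // placeholder broker address
// Serializers convert the key and value objects to byte arrays before sending
props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// Topic and value are required; the key is optional
ProducerRecord<String, String> record =
        new ProducerRecord<>("CustomerCountry", "Precision Products", "France");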
Next, the data is sent to a partitioner. If we specified a partition in the
ProducerRecord, the partitioner doesn’t do anything and simply returns the partition
we specified. If we didn’t, the partitioner will choose a partition for us, usually based
on the ProducerRecord key. Once a partition is selected, the producer knows which
topic and partition the record will go to. It then adds the record to a batch of records
that will also be sent to the same topic and partition. A separate thread is responsible
for sending those batches of records to the appropriate Kafka brokers.
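To make the partitioner’s role concrete, here is a small sketch: one record
names its partition explicitly, so the partitioner simply returns it, while the
other supplies only a key and lets the partitioner choose (by default, by
hashing the key). Batching can be tuned with the standard batch.size and
linger.ms settings; the values shown are illustrative, and props and producer
are the ones configured in the previous sketch.

// Explicit partition: the partitioner is bypassed and partition 0 is used as-is
ProducerRecord<String, String> toPartitionZero =
        new ProducerRecord<>("CustomerCountry", 0, "Precision Products", "France");

// Key only: the partitioner picks the partition, by default by hashing the key
ProducerRecord<String, String> byKey =
        new ProducerRecord<>("CustomerCountry", "Precision Products", "France");

// Batching knobs (illustrative values): bytes collected per partition batch,
// and how long to wait for more records before sending a partially full batch
props.put("batch.size", 16384);
props.put("linger.ms", 5);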
When the broker receives the messages, it sends back a response. If the messages
were successfully written to Kafka, it will return a RecordMetadata object with the
topic, partition, and the offset of the record within the partition. If the broker failed
to write the messages, it will return an error. When the producer receives an error, it
may retry sending the message a few more times before giving up and returning an
error.
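The retry behavior described above is controlled through standard producer
settings. A brief sketch, with illustrative values and the props object from
the earlier sketch:

// Retry transient failures a few times before giving up (illustrative values)
props.put("retries", 3);              // how many times to retry a failed send
props.put("retry.backoff.ms", 100);   // pause between retries, in milliseconds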
Once we instantiate a producer, it is time to start sending messages. There are three
primary methods of sending messages, each sketched in the example after this list:
Fire-and-forget
We send a message to the server and don’t really care if it arrives successfully
or not. Most of the time, it will arrive successfully, since Kafka is highly
available and the producer will retry sending messages automatically. However,
some messages will get lost using this method.
Synchronous send
We send a message, the send() method returns a Future object, and we use get()
to wait on the future and see if the send() was successful or not.
Asynchronous send
We call the send() method with a callback function, which gets triggered when it
receives a response from the Kafka broker.
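A condensed sketch of the three styles, continuing with the producer and record
from the earlier sketches (error handling is kept deliberately minimal):

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.RecordMetadata;

// Fire-and-forget: send the record and ignore the outcome
producer.send(record);

// Synchronous send: block on the Future; get() throws if the send failed
try {
    RecordMetadata metadata = producer.send(record).get();
    System.out.printf("written to %s-%d at offset %d%n",
            metadata.topic(), metadata.partition(), metadata.offset());
} catch (Exception e) {
    e.printStackTrace();
}

// Asynchronous send: the callback runs once the broker's response arrives
producer.send(record, new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            exception.printStackTrace();
        }
    }
});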
acks
The acks parameter controls how many partition replicas must receive the record
before the producer can consider the write successful. This option has a significant
impact on how likely messages are to be lost. There are three allowed values for the
acks parameter: 0 (the producer does not wait for a reply from the broker), 1 (the
leader replica alone must receive the record), and all (all in-sync replicas must
receive the record).
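As a small illustration, acks is set like any other producer configuration
property; "all" is shown here, with the alternatives noted in the comment, and
props is the configuration object from the earlier sketches.

// Consider the write successful only once all in-sync replicas have the record;
// "0" would not wait for the broker at all, "1" would wait for the leader only
props.put("acks", "all");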