Avoiding Kafka Data Loss
If for some reason the producer cannot deliver messages that have been consumed and committed by the consumer, it is possible for a MirrorMaker process to lose data.
To prevent data loss, use the following settings; note that these are the default settings. Example configuration files are sketched after the list.
For consumers:
- auto.commit.enabled=false

For producers:
- max.in.flight.requests.per.connection=1
- retries=Int.MaxValue
- acks=-1
- block.on.buffer.full=true
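As a concrete sketch, these settings could live in two properties files that are passed to MirrorMaker (see the invocation example further below). The file names, bootstrap.servers addresses, and group.id are illustrative placeholders, not values from this document, and retries is written out as the numeric value of Int.MaxValue; note also that the new Java consumer spells the auto-commit property enable.auto.commit.

    # consumer.properties -- consumer reading from the source cluster
    bootstrap.servers=source-broker:9092
    group.id=mirrormaker-group
    # disable automatic offset commits, as listed above
    auto.commit.enabled=false

    # producer.properties -- producer writing to the target cluster
    bootstrap.servers=target-broker:9092
    # allow only one in-flight request per broker connection
    max.in.flight.requests.per.connection=1
    # Int.MaxValue: retry (effectively) indefinitely
    retries=2147483647
    # wait for acknowledgement from all in-sync replicas
    acks=-1
    # block rather than drop records when the producer buffer is full
    block.on.buffer.full=true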
Specify the --abort.on.send.failure option when starting MirrorMaker.
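A minimal invocation sketch, assuming the placeholder consumer.properties and producer.properties files above and the stock kafka-mirror-maker.sh wrapper; the whitelist pattern and stream count are arbitrary examples:

    kafka-mirror-maker.sh \
      --consumer.config consumer.properties \
      --producer.config producer.properties \
      --whitelist ".*" \
      --num.streams 1 \
      --abort.on.send.failure true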
With these settings, MirrorMaker will take the following actions:
- MirrorMaker will send only one in-flight request to a broker at any given time.
- If any exception is caught in a MirrorMaker thread, MirrorMaker will try to commit the acked offsets and then exit immediately.
- On a RetriableException in the producer, the producer will retry indefinitely. If the retries do not succeed, MirrorMaker will eventually halt when the producer buffer fills up.
- On a non-retriable exception, if --abort.on.send.failure is specified, MirrorMaker will stop. If --abort.on.send.failure is not specified, the producer callback mechanism will record the message that was not sent, and MirrorMaker will continue running; in that case, the message will not be replicated in the target cluster.