Kafka Parameter Effects and Performance Testing (tom_fans's blog, CSDN)
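Kafka ships with two test scripts, kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh. Kafka has a great many parameters: some are fine at their defaults, while others affect performance enormously, and only by testing do you get an intuitive feel for them. Let's start with the producer.
First, here is how the producer script is used: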
[hdfs@namenode02 tmp]$ /opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh
usage: producer-performance [-h] --topic TOPIC --num-records NUM-RECORDS --record-size RECORD-SIZE --throughput THROUGHPUT
--producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]
This tool is used to verify the producer performance.
optional arguments:
-h, --help show this help message and exit
--topic TOPIC produce messages to this topic
--num-records NUM-RECORDS
number of messages to produce
--record-size RECORD-SIZE
message size in bytes
--throughput THROUGHPUT
throttle maximum message throughput to *approximately* THROUGHPUT messages/sec
--producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]
kafka producer related configuaration properties like bootstrap.servers,client.id etc..
[hdfs@namenode02 tmp]$
The baseline test command is shown below. It sends 1,000,000 records of 100 bytes each:
/opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh --topic jlwang --num-records 1000000 --record-size 100 --producer-props bootstrap.servers=datanode04.isesol.com:9092 --throughput 1000000
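Since none of the defaults were changed, the main parameters in effect are:
buffer.memory = 33554432 This is the message buffer (32 MB); by default the producer writes each message into this buffer first.
block.on.buffer.full = false What happens if so many messages are sent that the buffer fills up? The buffer is drained as batches go out to the broker, but what if records are produced faster than it can be drained? This parameter decides whether the send then blocks or raises an error.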
request.timeout.ms = 30000
acks = 1
retries = 0
max.request.size = 1048576
linger.ms = 0
batch.size = 16384
Next we mainly test the effect of batch.size, buffer.memory, acks and linger.ms.
Defaults:
1000000 records sent, 288184.438040 records/sec (27.48 MB/sec), 574.34 ms avg latency, 918.00 ms max
acks=all :
1000000 records sent, 121212.121212 records/sec (11.56 MB/sec), 1566.87 ms avg latency, 2640.00 ms max latency
acks=all, linger.ms=100 (i.e. 100 ms) :
1000000 records sent, 128188.693757 records/sec (12.23 MB/sec), 1506.37 ms avg latency, 1960.00 ms max latency
buffer.memory=100000 :
1000000 records sent, 66427.527567 records/sec (6.34 MB/sec), 1.06 ms avg latency, 307.00 ms max latency
batch.size=1, acks=1 :
16669 records sent, 3333.8 records/sec (0.32 MB/sec), 2587.5 ms avg latency, 4303.0 max latency.
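The run then failed with org.apache.kafka.common.errors.TimeoutException: Batch Expired. Records were being produced far faster than they could be sent, so queued batches timed out and the test failed.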
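The runs above can be reproduced with a small wrapper around the perf-test script. This is only a sketch: the script path, broker address and topic are the ones used in this post, the overrides are passed through the --producer-props option shown in the usage text, and DRY_RUN and run_case are hypothetical helper names introduced here. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Sketch: print (or run) one kafka-producer-perf-test.sh invocation per setting.
# PERF_TEST, BOOTSTRAP and TOPIC default to the values used in this post.
PERF_TEST="${PERF_TEST:-/opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-producer-perf-test.sh}"
BOOTSTRAP="${BOOTSTRAP:-datanode04.isesol.com:9092}"
TOPIC="${TOPIC:-jlwang}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

run_case() {
  # Prepend the fixed arguments to this case's PROP=VALUE overrides (may be none).
  set -- "$PERF_TEST" --topic "$TOPIC" --num-records 1000000 --record-size 100 \
         --throughput 1000000 --producer-props "bootstrap.servers=$BOOTSTRAP" "$@"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_case                          # defaults
run_case acks=all
run_case acks=all linger.ms=100
run_case buffer.memory=100000
run_case batch.size=1 acks=1
```

Keeping every other argument fixed and changing only the overrides makes the result lines directly comparable from one run to the next.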
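There is really no need to test further: each of the parameters above has a considerable effect on overall send performance. If the send volume is large, consider increasing buffer.memory, batch.size and linger.ms, and setting acks to 1. As for how large, frankly I think doubling the defaults is enough; they don't need to be very big. If the send volume is small, the defaults Kafka provides are quite good and can cope with most systems.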
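As a rough illustration of the doubling suggestion, the snippet below derives doubled values from the defaults listed earlier and prints the matching --producer-props argument. It is a sketch, not a tuned configuration: linger.ms defaults to 0, so doubling it changes nothing, and the value 5 used here is purely an assumed starting point.

```shell
# Defaults as listed earlier in this post.
DEFAULT_BUFFER_MEMORY=33554432
DEFAULT_BATCH_SIZE=16384

# Doubled values, following the "just double it" rule of thumb.
BUFFER_MEMORY=$((DEFAULT_BUFFER_MEMORY * 2))   # 67108864 bytes (64 MiB)
BATCH_SIZE=$((DEFAULT_BATCH_SIZE * 2))         # 32768 bytes (32 KiB)

# linger.ms defaults to 0; 5 ms is an arbitrary assumption, not a measured value.
LINGER_MS=5

PRODUCER_PROPS="bootstrap.servers=datanode04.isesol.com:9092 acks=1 buffer.memory=$BUFFER_MEMORY batch.size=$BATCH_SIZE linger.ms=$LINGER_MS"
echo "--producer-props $PRODUCER_PROPS"
```

The printed argument can be appended to the kafka-producer-perf-test.sh command used above to check whether the larger values actually help on a given cluster.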
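One more point: the record size (--record-size) also has a heavy effect on send rate, so in production try to avoid overly large records. In the test results below I set the record size to 10000 bytes, and the record rate dropped sharply, to roughly 3,400 to 5,700 records/sec (the byte throughput, at about 32 to 54 MB/sec, is nevertheless higher than the 27.48 MB/sec of the default run):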
24499 records sent, 4899.8 records/sec (46.73 MB/sec), 364.1 ms avg latency, 748.0 max latency.
28500 records sent, 5700.0 records/sec (54.36 MB/sec), 346.4 ms avg latency, 742.0 max latency.
28134 records sent, 5626.8 records/sec (53.66 MB/sec), 363.0 ms avg latency, 806.0 max latency.
28037 records sent, 5607.4 records/sec (53.48 MB/sec), 362.7 ms avg latency, 821.0 max latency.
23201 records sent, 4640.2 records/sec (44.25 MB/sec), 429.9 ms avg latency, 1088.0 max latency.
17055 records sent, 3411.0 records/sec (32.53 MB/sec), 605.7 ms avg latency, 1361.0 max latency.
21415 records sent, 4283.0 records/sec (40.85 MB/sec), 490.0 ms avg latency, 1019.0 max latency.
26560 records sent, 5312.0 records/sec (50.66 MB/sec), 383.6 ms avg latency, 853.0 max latency.
23193 records sent, 4638.6 records/sec (44.24 MB/sec), 446.7 ms avg latency, 1225.0 max latency.
26156 records sent, 5231.2 records/sec (49.89 MB/sec), 387.6 ms avg latency, 1068.0 max latency.
28024 records sent, 5604.8 records/sec (53.45 MB/sec), 372.2 ms avg latency, 855.0 max latency.
27209 records sent, 5441.8 records/sec (51.90 MB/sec), 377.0 ms avg latency, 842.0 max latency.
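The two numbers on each result line are tied together by the record size. The figures reported above are consistent with MB/sec = records/sec × record size in bytes / (1024 × 1024), which a quick check with awk confirms. The mb_per_sec helper is a hypothetical name introduced only for this sketch.

```shell
# MB/sec = records/sec * record size in bytes / (1024 * 1024)
mb_per_sec() {
  awk -v rate="$1" -v size="$2" 'BEGIN { printf "%.2f\n", rate * size / (1024 * 1024) }'
}

mb_per_sec 288184.438040 100   # default run, 100-byte records        -> 27.48
mb_per_sec 4899.8 10000        # first interval, 10000-byte records   -> 46.73
```

So a lower records/sec figure at a larger record size does not by itself mean fewer bytes are moving; compare both columns when judging a run.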
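I won't run detailed tests for the consumer, mainly because far fewer parameters matter there: receive.buffer.bytes, auto.offset.reset, max.partition.fetch.bytes, fetch.min.bytes, isolation.level, max.poll.interval.ms, request.timeout.ms.
Those are probably the only parameters you will actually set; the rest are rarely used.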