kafka-docker
[Bug] Kafka auto log clean not working
Hi folks,
One problem I encountered is that Kafka's log files keep growing and are not cleaned up automatically.
I set KAFKA_LOG_RETENTION_MS and KAFKA_LOG_RETENTION_BYTES in the docker-compose file.
Are there any problems with these docker configs?
kafka1:
  restart: always
  image: wurstmeister/kafka:2.13-2.6.0
  ports:
    - 9092:9092
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181
    KAFKA_ADVERTISED_HOST_NAME: 10.17.19.210
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://10.17.19.210:9092
    KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
    KAFKA_CREATE_TOPICS: requests:100:1:delete --config=retention.ms=60000 --config=segment.bytes=26214400 --config=retention.bytes=104857600,tb_transport.api.requests:30:1:delete --config=retention.ms=60000 --config=segment.bytes=26214400 --config=retention.bytes=104857600
    KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
    KAFKA_LOG_RETENTION_BYTES: 1073741824
    KAFKA_LOG_SEGMENT_BYTES: 268435456
    KAFKA_LOG_RETENTION_MS: 300000
    # KAFKA_LOG_CLEANER_ENABLE: 'true'
    KAFKA_LOG_CLEANUP_POLICY: delete
kafka2:
  restart: always
  image: wurstmeister/kafka:2.13-2.6.0
  ports:
    - 9093:9093
  environment:
    KAFKA_BROKER_ID: 2
    KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181
    KAFKA_ADVERTISED_HOST_NAME: 10.17.19.210
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://10.17.19.210:9093
    KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9093
    KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
    KAFKA_LOG_RETENTION_BYTES: 1073741824
    KAFKA_LOG_CLEANER_ENABLE: 'true'
    KAFKA_LOG_SEGMENT_BYTES: 268435456
    KAFKA_LOG_RETENTION_CHECK_INTERVAL_MS: 5000
    # KAFKA_LOG_RETENTION_MS: 10000
    KAFKA_LOG_CLEANUP_POLICY: delete
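For reference, kafka-docker converts each KAFKA_* environment variable into the matching server.properties key (lowercased, with underscores replaced by dots), so KAFKA_LOG_RETENTION_MS becomes log.retention.ms and so on. A quick way to double-check what the container actually generated (a sketch; the path assumes the wurstmeister image layout, where KAFKA_HOME is /opt/kafka):

# Inspect the broker properties generated from the KAFKA_* variables.
docker exec kafka1 grep -E 'log\.(retention|segment|cleanup|cleaner)' \
  /opt/kafka/config/server.properties

# Expected entries for kafka1, based on the compose file above:
#   log.retention.ms=300000
#   log.retention.bytes=1073741824
#   log.segment.bytes=268435456
#   log.cleanup.policy=delete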
......
Has nobody else run into this issue?
Are you referring to Kafka's own log files (server.log, controller.log, log-cleaner.log, etc.) or to the topic logs (requests-N, tb_transport.api.requests-N)?
I ran a test with your configuration, and the messages I produced to the topic "requests" were deleted after 60000 ms as expected:
[2022-03-16 13:33:10,738] INFO [ProducerStateManager partition=requests-0] Writing producer snapshot at offset 2 (kafka.log.ProducerStateManager)
[2022-03-16 13:33:10,748] INFO [Log partition=requests-0, dir=/kafka/data] Rolled new log segment at offset 2 in 24 ms. (kafka.log.Log)
[2022-03-16 13:33:10,749] INFO [Log partition=requests-0, dir=/kafka/data] Deleting segment LogSegment(baseOffset=1, size=132, lastModifiedTime=1647437528000, largestRecordTimestamp=Some(1647437529695)) due to retention time 60000ms breach based on the largest record timestamp in the segment (kafka.log.Log)
[2022-03-16 13:33:10,754] INFO [Log partition=requests-0, dir=/kafka/data] Incremented log start offset to 2 due to segment deletion (kafka.log.Log)
[2022-03-16 13:34:10,755] INFO [Log partition=requests-0, dir=/kafka/data] Deleting segment files LogSegment(baseOffset=1, size=132, lastModifiedTime=0, largestRecordTimestamp=Some(1647437529695)) (kafka.log.Log)
[2022-03-16 13:34:10,759] INFO Deleted log /kafka/data/requests-0/00000000000000000001.log.deleted. (kafka.log.LogSegment)
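If you want to reproduce this check yourself, here is a minimal sketch (broker address and topic name taken from the compose file above; the console tools ship with Kafka):

# Produce a test message to the "requests" topic.
echo "test-message" | kafka-console-producer.sh \
  --broker-list 10.17.19.210:9092 --topic requests

# Watch the partition directory inside the container. Once retention.ms
# (60000 ms here) plus the retention check interval have passed, the old
# segment is renamed to *.deleted and removed shortly afterwards.
docker exec kafka1 ls -l /kafka/data/requests-0/

Note the two timestamps in the log above: the segment is marked for deletion at 13:33, and the file is physically removed a minute later, once file.delete.delay.ms (60000 ms by default) has elapsed.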
I mean the topic logs, that is, the data actually stored. There are a lot of topics in my Kafka cluster, not only the "requests" topic. How do I ensure that the data of the other topics will be cleared as well?
In your case you create the requests and tb_transport.api.requests topics with specific configurations for retention.ms and retention.bytes.
Any other topics will be created with the cluster default settings unless you specify otherwise, and you can use the Kafka commands to check these settings at the broker level or on a specific topic (see the sketch after this paragraph):
https://stackoverflow.com/questions/35997137/how-do-you-get-default-kafka-configs-global-and-per-topic-from-command-line
You could also include a GUI that lets you easily check a topic's settings and adjust them if needed; I have had good experience with either Kafdrop or kafka-ui.
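For example (a minimal sketch; the broker address is taken from the compose file above, and kafka-configs.sh ships with Kafka):

# Show per-topic overrides for one topic.
kafka-configs.sh --bootstrap-server 10.17.19.210:9092 \
  --entity-type topics --entity-name requests --describe

# Show dynamic cluster-wide broker defaults. Note that --describe only
# lists overridden values; settings inherited from server.properties
# do not appear here.
kafka-configs.sh --bootstrap-server 10.17.19.210:9092 \
  --entity-type brokers --entity-default --describe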
Got it, thank you.
@xiddjp can this issue be closed? 😃
How do I create topics with specific configurations without using the command line?
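One option, which the compose file above already uses, is this image's KAFKA_CREATE_TOPICS variable: the topics listed there are created at startup in the form name:partitions:replicas:cleanup.policy, optionally followed by --config overrides. A minimal sketch (my.topic and the values are placeholders):

environment:
  # Create "my.topic" with 10 partitions, replication factor 1, the
  # delete cleanup policy, and per-topic retention overrides.
  KAFKA_CREATE_TOPICS: my.topic:10:1:delete --config=retention.ms=60000 --config=retention.bytes=104857600

The GUIs mentioned above (Kafdrop, kafka-ui) also let you create and reconfigure topics from the browser.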
We also encountered the same problem: expired data on a topic cannot be deleted automatically. Our Kafka version is 2.13-2.5.1. We first tried manually clearing the data, shortening the topic's retention time, and restarting the broker, but that did not work. We then lowered log.retention.hours from 168 to 48, manually cleared the data, and restarted the broker again; this time it took effect.
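For what it's worth, topic retention can also be lowered at runtime without restarting the broker, which should trigger deletion at the next retention check. A sketch assuming a local broker address and a placeholder topic name (on older brokers you may need the --zookeeper form of kafka-configs.sh instead):

# Lower retention.ms for an existing topic; no broker restart required.
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my.topic \
  --alter --add-config retention.ms=172800000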
@xiddjp Have you solved this problem or have any more clues?