confluent-kafka-go
Producer.Close() hangs indefinitely when unflushed messages remain in the producer
Description
`Producer.Close()` hung indefinitely (I waited 10 minutes). Before calling `Close`, I called `Flush` with a limit of 10,000 ms; its return value was 186,962 (that many messages remained unflushed upon return). Also, while the `Close` call was hanging, `Producer.Len()` would return 186,431.
How to reproduce
I don't have a clean reproduction, but I have code that was producing nearly 70K messages per second (based on Kafka's metrics) to a topic with two partitions when this error occurred.
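The producing side looked roughly like the sketch below (broker address, topic name, payload, and message count are placeholders, and it needs a running broker, so it is illustrative rather than a self-contained repro):

```go
package main

import (
	"fmt"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
	p, err := kafka.NewProducer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092", // placeholder
		"acks":              1,
	})
	if err != nil {
		panic(err)
	}

	topic := "topic_name" // placeholder
	payload := make([]byte, 100)

	// Drain the delivery-report channel so it does not back up.
	go func() {
		for range p.Events() {
		}
	}()

	// Fire-and-forget at a high rate (~70K msg/s in the real workload).
	for i := 0; i < 1_000_000; i++ {
		_ = p.Produce(&kafka.Message{
			TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
			Value:          payload,
		}, nil)
	}

	remaining := p.Flush(10000) // returned 186,962 in my run
	fmt.Println("unflushed after Flush:", remaining, "Len:", p.Len())
	p.Close() // hangs indefinitely
}
```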
Checklist
Please provide the following information:
- [x] confluent-kafka-go and librdkafka version (`LibraryVersion()`): 17040127, 1.4.2
- [x] Apache Kafka broker version: Docker image `docker.io/bitnami/kafka:2-debian-10` (kafka_2.12-2.6.0)
- [x] Client configuration: `ConfigMap{...}`: `"acks": 1, "bootstrap.servers": "..."`
- [x] Operating system: Linux

  ```
  $ uname -a
  Linux f0634a3cfcda 5.4.53.1.amd64-smp #1 SMP Fri Jul 24 10:28:09 CEST 2020 x86_64 GNU/Linux
  $ go version
  go version go1.15 linux/amd64
  ```

- [ ] Provide client logs (with `"debug": ".."` as necessary)
- [x] Provide broker log excerpts: Nothing special; `Flush` and `Close` were called at about 21:08
```
kafka_1 | [2020-09-27 21:06:19,674] INFO [GroupMetadataManager brokerId=1001] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
kafka_1 | [2020-09-27 21:07:00,045] INFO [ProducerStateManager partition=topic-1] Writing producer snapshot at offset 1956018171 (kafka.log.ProducerStateManager)
kafka_1 | [2020-09-27 21:07:00,342] INFO [Log partition=topic-1, dir=/data/01] Rolled new log segment at offset 1956018171 in 2196 ms. (kafka.log.Log)
kafka_1 | [2020-09-27 21:07:11,468] INFO [ProducerStateManager partition=topic_name-0] Writing producer snapshot at offset 4789613528 (kafka.log.ProducerStateManager)
kafka_1 | [2020-09-27 21:07:11,778] INFO [Log partition=topic_name-0, dir=/data/00] Rolled new log segment at offset 4789613528 in 2607 ms. (kafka.log.Log)
kafka_1 | [2020-09-27 21:08:08,054] INFO [ProducerStateManager partition=topic_name-1] Writing producer snapshot at offset 1957162306 (kafka.log.ProducerStateManager)
kafka_1 | [2020-09-27 21:08:08,327] INFO [Log partition=topic_name-1, dir=/data/01] Rolled new log segment at offset 1957162306 in 1949 ms. (kafka.log.Log)
kafka_1 | [2020-09-27 21:08:14,160] INFO [ProducerStateManager partition=topic_name-0] Writing producer snapshot at offset 4790755843 (kafka.log.ProducerStateManager)
kafka_1 | [2020-09-27 21:08:14,319] INFO [Log partition=topic_name-0, dir=/data/00] Rolled new log segment at offset 4790755843 in 1175 ms. (kafka.log.Log)
kafka_1 | [2020-09-27 21:16:19,670] INFO [GroupMetadataManager brokerId=1001] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
kafka_1 | [2020-09-27 21:20:44,684] INFO [ProducerStateManager partition=topic_name-1] Writing producer snapshot at offset 1958297907 (kafka.log.ProducerStateManager)
kafka_1 | [2020-09-27 21:20:44,811] INFO [Log partition=topic_name-1, dir=/data/01] Rolled new log segment at offset 1958297907 in 6952 ms. (kafka.log.Log)
```
- [ ] Critical issue: Idk 🤷
I am facing the same issue.
Same issue here. Am I not following the correct shutdown sequence?
Same issue here
Do you still have this issue with the latest version? If yes, can you please share the producer code?
Please reopen with the above information if this is still happening in the latest version. There have been changes to the closing sequence recently.