confluent-kafka-go
High memory usage when producing messages
Description
While load testing a microservice that uses a Kafka producer, I noticed that the service's memory usage was rather high.
Profiling the service showed that the call to _Cfunc_GoBytes accounts for a large share of the allocations. Is this normal?
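If it helps anyone reproduce the profile, the heap numbers above can be collected with the standard net/http/pprof endpoint. This is a minimal sketch, not taken from the original service; the listen address is an assumption.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on DefaultServeMux
)

func main() {
	// Expose the profiling endpoints, then inspect the heap with:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the service and its Kafka producer here ...
	select {}
}
```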
How to reproduce
Put some load on the service that utilizes the kafka producer
Checklist
Please provide the following information:
- [ ] confluent-kafka-go and librdkafka version (LibraryVersion()): github.com/confluentinc/confluent-kafka-go v1.5.2
- [ ] Apache Kafka broker version:
- [ ] Client configuration:
  config := &kafka.ConfigMap{
      "metadata.broker.list": brokers,
      "security.protocol":    "SASL_SSL",
      "sasl.mechanisms":      "SCRAM-SHA-256",
      "sasl.username":        username,
      "sasl.password":        password,
      "acks":                 "all",
      "retries":              3,
      "batch.size":           9000000, // 9 MB
      "linger.ms":            5,
      "debug":                "msg",
  }
  (see the producer-creation sketch after this checklist)
- [ ] Operating system: ubuntu 20.04
- [ ] Provide client logs (with "debug": ".." as necessary)
- [ ] Provide broker log excerpts
- [ ] Critical issue
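For context, here is a minimal sketch of how a producer is created from a ConfigMap like the one in the checklist and shut down cleanly; the broker address and the 15-second flush timeout are assumptions, not values from the issue.

```go
package main

import (
	"fmt"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func run() error {
	// Reduced config; see the checklist above for the full set of options used.
	p, err := kafka.NewProducer(&kafka.ConfigMap{
		"metadata.broker.list": "broker1:9093", // placeholder
		"acks":                 "all",
	})
	if err != nil {
		return fmt.Errorf("can't create producer: %w", err)
	}
	defer p.Close()

	// ... Produce messages here ...

	// Wait up to 15s for outstanding deliveries before closing.
	p.Flush(15 * 1000)
	return nil
}
```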
I had the same issue; it was because I was not reading the deliveryChan of the Produce method.
Same problem here. Apparently we need to do one of the following two things to avoid the problem:
- Provide a delivery channel to the Produce method and read from that channel afterwards (as sketched below)
- Read the delivery event from the producer.Events() channel
If we (the users) do neither, a "leak" is seen, and the amount of memory being held apparently depends on the size of the messages being produced.
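A minimal sketch of the first option, assuming a hypothetical helper around an already-created *kafka.Producer (the function, topic and payload names are illustrative, not from this issue):

```go
package main

import (
	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// produceAndWait passes a dedicated delivery channel to Produce and blocks
// until the delivery report arrives; reading the report is what allows the
// client to release the message.
func produceAndWait(p *kafka.Producer, topic string, payload []byte) error {
	deliveryChan := make(chan kafka.Event, 1)
	defer close(deliveryChan)

	err := p.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
		Value:          payload,
	}, deliveryChan)
	if err != nil {
		return err
	}

	// The delivery report is a *kafka.Message; TopicPartition.Error carries
	// the per-message delivery error, if any.
	if m, ok := (<-deliveryChan).(*kafka.Message); ok {
		return m.TopicPartition.Error
	}
	return nil
}
```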
After our discussion and investigation, this appears to be a Go issue. Can you please try running your code with GODEBUG=madvdontneed=1?
We didn't try running with GODEBUG=madvdontneed=1, but we changed how we use the Producer.
We had a memory leak using the producer like this:
var deliveryChannel chan kafka.Event
if p.config.waitForDelivery() {
    deliveryChannel = make(chan kafka.Event)
}
err = p.producer.Produce(&message, deliveryChannel)
if err != nil {
    return fmt.Errorf("can't produce message: %w", err)
}
return p.waitForDelivery(deliveryChannel)
Reading the documentation, we understood that we could use a delivery channel to listen for delivery reports and that, since the channel is optional, there was no need to pass one if we didn't need it.
After many tests we changed it to ALWAYS read the events channel, and NEVER use the delivery channel:
err = p.producer.Produce(&message, nil)
if err != nil {
    return fmt.Errorf("can't produce message: %w", err)
}
// With a nil delivery channel, the delivery report arrives on producer.Events();
// reading it here is what releases the message.
event := <-p.producer.Events()
That fixed the memory leak for us.
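For completeness, another pattern that should avoid the same build-up, sketched here under the assumption of a single long-lived producer (names are illustrative, not from the comment above), is to keep Produce asynchronous and drain Events() from one dedicated goroutine:

```go
package main

import (
	"log"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// drainEvents consumes all events from the producer in one place, so delivery
// reports never accumulate inside the client. Run it once, right after the
// producer is created.
func drainEvents(p *kafka.Producer) {
	go func() {
		for e := range p.Events() {
			switch ev := e.(type) {
			case *kafka.Message:
				if ev.TopicPartition.Error != nil {
					log.Printf("delivery failed: %v", ev.TopicPartition.Error)
				}
			case kafka.Error:
				log.Printf("producer error: %v", ev)
			}
		}
	}()
}
```

With something like this in place, Produce can keep being called with a nil delivery channel without the reports piling up unread.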
Sorry, but I don't see how this is an issue with Go; for us it is an issue with the API of this library.
We charted the memory usage of the application before and after the change; the only thing affecting the memory usage was the way we used this library.
I agree with @unkiwii. IMHO the documentation is not clear, in the sense that it gives the impression that the delivery channel is optional. Once the channel was treated as required rather than optional, the memory leak disappeared.
Thanks @unkiwii for the suggestion. We were also being affected by this behavior and refactoring the code to always read the events channel did the trick.