Do not allocate more memory than configured

Open daib opened this issue 3 years ago • 2 comments

When decompressed batches are large, e.g., 200MB, the buffer allocated to hold one can grow to 600MB because of this buffer growth formula: `out_bufsize += RD_MAX(out_bufsize * 2, 4000);`. Each growth step adds twice the current size, so the buffer triples per iteration and the final allocation can approach three times the batch size. This happens easily with zstd, where compression ratios are often very good.
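
To make the growth concrete, here is a standalone sketch of that rule (the 4000-byte starting size is an assumption for illustration, not necessarily librdkafka's actual initial value):

```c
#include <stdio.h>
#include <stddef.h>

#define RD_MAX(a, b) ((a) > (b) ? (a) : (b))

int main(void) {
        size_t payload = 200u * 1024 * 1024; /* 200MB decompressed batch */
        size_t out_bufsize = 4000;           /* assumed starting size */

        /* The quoted rule adds 2x the current size on every pass,
         * i.e. the buffer triples per iteration. */
        while (out_bufsize < payload) {
                out_bufsize += RD_MAX(out_bufsize * 2, 4000);
                printf("grew buffer to %zu bytes\n", out_bufsize);
        }
        /* With this starting size the final buffer lands at ~236MB for
         * a 200MB payload; in the worst case, when the previous size
         * sits just below the payload, the final allocation approaches
         * 3x the payload (the 600MB figure in this report). */
        return 0;
}
```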

Multiply 600MB by the number of partitions a single consumer needs to handle, e.g., 30, and the total memory usage can reach 18GB just for storing decompressed batches. That is a huge overshoot of librdkafka's configured memory usage and can potentially lead to out-of-memory issues.

We should not allocate more memory than configured.

daib · Aug 23 '22 21:08

This is in stark contrast to #3935 :)

edenhill · Oct 03 '22 12:10

> This is in stark contrast to #3935 :)

I think the problem here is that librdkafka allocates more memory than the config specifies, and that creates a problem for us. For example, with a message size limit of 300MB, the way librdkafka grows its buffer means it can allocate almost 900MB just to decode 300MB of messages. This is an inefficient way to handle memory. At DataDog, we have had lots of pods run out of memory because zstd compression is very good.

Removing the limit may be fine for a small system, but for a data-intensive system it is very dangerous. A capped growth rule is sketched below.
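
One way to express the requested behavior is to clamp the growth at a configured ceiling and fail cleanly once it is reached. A minimal sketch, assuming a hypothetical `max_bufsize` parameter stands in for whatever configured limit applies; this is not librdkafka's actual code:

```c
#include <stddef.h>

#define RD_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Grow *bufsize by the existing rule, but never past max_bufsize.
 * Returns 0 on success, -1 when the configured ceiling is already
 * reached, letting the caller surface a decode error instead of
 * allocating unbounded memory. */
static int grow_capped(size_t *bufsize, size_t max_bufsize) {
        size_t next = *bufsize + RD_MAX(*bufsize * 2, 4000);

        if (next > max_bufsize) {
                if (*bufsize >= max_bufsize)
                        return -1;       /* at the ceiling: give up */
                next = max_bufsize;      /* last step: clamp to limit */
        }
        *bufsize = next;
        return 0;
}
```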

daib · Oct 03 '22 20:10
