aiokafka
aiokafka copied to clipboard
Severe performance degradation after migrating to cramjam LZ4
Describe the bug
In #960, when switching compression libraries, the default compression settings for LZ4 were also changed. In python-lz4, the default level was 0. Now it is 9.
We have seen an over 100% increase in producer latency in a production application due to this change. It happens to publish a relatively large binary payload. We are faced with a decision to either downgrade back to 0.10.0 or explicitly monkeypatch codec.lz4_compress.
Generally speaking, folks who are deliberately choosing to install lz4 when Standard Library natively supports others are opting for faster/weaker compression. From the link included in the function documentation:
the choice of going from 0/1 -> 9 yields a 45% latency increase for a 1.2% improvement in compression ratio. I don't think folks are signing up for this trade-off.
Expected behaviour
Our "expected" behavior is that, since one cannot explicitly control compression settings and only compression.type, that settings are identical to they were when aiokafka directly used python-lz4. Could we consider reverting the default level to 0 to match the original behavior pre-0.11.0?
We also have a problem.
aiokafka migrated to cramjam lz4 since version 0.11.0.
With 0.11.0 we got an lz4-related runtime error in production and reverted back to aiokafka 0.10.0.
Since then, we are stuck with 0.10.0.
@ods Is there a way to use aiokafka with the former lz4 implementation?
@aisven If your issue is specifically related to the compressor level changing, you can consider monkey patching the value as we are:
aiokafka.codec.lz4_encode.__defaults__ = (0,)
But I would also appreciate reverting to the original default compression level.
@ods Thank you for your answer.
By the way, what was the motivation to use cramjam lz4 in aiokafka?
@aisven Sorry for long delay. aiokafka already used cramjam for other codecs and cramjam provides LZ4, so it's a way to remove one more dependency. If I understand correctly, the problem is not the implementation, but rather wrong default compression level (now it's the same as in broker itself) and missing a way to configure it.
It would be nice if someone provide a benchmark to justify changing the level back or to other lower lever.
Hey @ods, the screenshot that I provided in the original post is from the KIP itself and shows that the difference between level 1 and level 9 resulting in a drop from ~1700 msg/s to 1100 msg/s for their benchmark. I can also say that we are using this library in production and without overriding the current compression level, we observe significant impact.
Hi, any chance we can merge this?
Merge what exactly? There is no pull request with proposed changes
Sorry, to be clear, to revert the (likely unintended) breaking change that the compression level for LZ4 changed due to the migration of the library implementation to cramjam.
Changing compression level is not a breaking change. Code still works as expected, it just become slower.
I think the better solution will be either change default compression level to 1-3, or introduce new option compression_level which allow to pass custom values if required. Or both the changes.
BTW, recently CPython changed default complression level for gzip from 9 to 6: https://github.com/python/cpython/pull/131470 Motivated by: https://github.com/python/cpython/issues/91349#issuecomment-2737161048 As level=9 takes twice time of level=6, giving only -0.3% of size. Actually, level=3 is x3 faster than level=6 according to this benchmark.
Changing compression level is not a breaking change. Code still works as expected, it just become slower.
This is a pretty glib response. It was clear that the migration to cramjam was intended to simplify dependency management for the project, not to cause everyone's real-time systems to stop meeting their SLOs.
I think the better solution will be either change default compression level to 1-3, or introduce new option compression_level which allow to pass custom values if required. Or both the changes.
I'll defer to the maintainers here. Having the ability to override the compression settings would be nice, but this was not a feature even prior to the cramjam migration. Without a mechanism to override, changing the level to something like 1-3 sounds fine too.
In #1121 I've changed compression level to 1