python-build-standalone
LZMADecompressor is slow
With Python 3.12.3 installed via uv python install, LZMA decompression is roughly 2x slower than with the Ubuntu 24.04 system python3 or a pyenv-installed Python of the same version.
For example:
uv installed python:
$ python -V
Python 3.12.3
$ python decompress.py
Time taken to decompress: 0.001038 seconds
system python (from ubuntu):
$ /usr/bin/python -V
Python 3.12.3
$ /usr/bin/python decompress.py
Time taken to decompress: 0.000400 seconds
$ uv python list --only-installed
cpython-3.12.3-linux-aarch64-gnu /usr/local/uv/python/cpython-3.12.3-linux-aarch64-gnu/bin/python3 -> python3.12
I also tried this on macOS, comparing Python 3.12 installed with pyenv against the one installed with uv, and saw the same slowdown.
I tested with decompress.py and with several other xz files.
Interesting. Thanks for the report. Any clue why this could be?
I have a hunch that the 2nd call from the same process will be faster and have similar performance as other builds.
Thanks for looking into this. We really need this fixed at @commaai: we decompress an 800 MB xz update file, and it takes almost twice as long.
I tried figuring out the issue myself, but with no luck. If you give me a pointer on what to change, I may be able to fix it myself and PR it, but right now I have no idea.
If you care about decompression wall times you should consider the zstandard Python package. xz/lzma will give you really good compression ratios but the decompression speeds are usually vastly slower than zstd.
We were thinking about switching to Zstandard, but didn't want to change the image format right now, since we would need to change the code in three places and also validate everything. We can't update it right now, so we hope for a fix.
> I have a hunch that the 2nd call from the same process will be faster and have similar performance as other builds.
I tried running decompress multiple times in the same process, but the speed is the same. Did I misunderstand your comment?
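One way to check the "2nd call is faster" hypothesis is to time several decompressions back to back in a single process and compare the first run with the later ones. The sketch below assumes the standard-library lzma module; time_repeated and the sample payload are hypothetical names for illustration, not code from this thread.

```python
import lzma
import time


def time_repeated(compressed: bytes, runs: int = 5) -> list[float]:
    """Decompress the same payload `runs` times in one process.

    Returns the elapsed time of each run, in order, so the first call
    can be compared with subsequent ones.
    """
    timings = []
    for _ in range(runs):
        # A decompressor cannot be reused once it reaches end of stream,
        # so create a fresh one for every run.
        decompressor = lzma.LZMADecompressor()
        start = time.perf_counter()
        decompressor.decompress(compressed)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Hypothetical sample payload; substitute the bytes of a real .xz file.
    payload = lzma.compress(bytes(range(256)) * 4096)
    for i, seconds in enumerate(time_repeated(payload), start=1):
        print(f"run {i}: {seconds:.6f} seconds")
```

If the first run were noticeably slower than the rest, that would point to a one-time startup cost. Roughly equal timings across runs, as reported above, suggest the slowdown is in the decompression itself.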