Use bz2 or lzma for python >= 3.3 ?
Hi,
- Just an idea: now that many Python projects are dropping Python 2.7 and mostly support only
Python >= 3.4, a wheel that targets only Python 3.x could be made smaller by using bz2 or lzma.
Typical example: numpy-1.15.0-cp37-cp37m-manylinux1_x86_64.whl (size: 13845063 bytes), only for Python 3.7, could shrink by about 50% using lzma (size: 6899142 bytes).
- A simpler idea, if keeping zlib: starting with Python 3.7, zipfile exposes the compresslevel param. When wheels are generated from Python 3.7, it could also help to set compresslevel to a value higher than 6.
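To get a rough feel for the two ideas above, here is a minimal sketch that builds one-member zip archives in memory with the methods the zipfile module offers. The payload is synthetic and the helper name zip_size is made up for this illustration, so the numbers it prints say nothing about any real wheel.

```python
import io
import zipfile


def zip_size(payload: bytes, method: int, level=None) -> int:
    """Return the size in bytes of a one-member zip archive built in memory."""
    buf = io.BytesIO()
    # The compresslevel argument is only accepted by zipfile on Python 3.7+.
    with zipfile.ZipFile(buf, "w", compression=method, compresslevel=level) as zf:
        zf.writestr("data.bin", payload)
    return len(buf.getvalue())


# Synthetic, repetitive payload (an assumption, not real wheel content).
payload = b"".join(b"symbol_%06d\x00" % i for i in range(50_000))

sizes = {
    "deflate (default level)": zip_size(payload, zipfile.ZIP_DEFLATED),
    "deflate, compresslevel=9": zip_size(payload, zipfile.ZIP_DEFLATED, 9),
    "bzip2": zip_size(payload, zipfile.ZIP_BZIP2, 9),
    "lzma": zip_size(payload, zipfile.ZIP_LZMA),
}

print(f"{'uncompressed':26s} {len(payload):>9d} bytes")
for name, size in sizes.items():
    print(f"{name:26s} {size:>9d} bytes")
```

Running the same comparison on the files of an unpacked wheel would give more meaningful numbers than this toy payload.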
Thanks !
Certainly not by default on any Python, and PyPI may consider refusing such wheels.
The second idea sounds a bit safer. Do you have numbers for this?
Hi @agronholm
The compresslevel with zlib is less interesting:
For numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl,
- uncompressed size: 51928 kibibytes
- wheel available on PyPI: 13510.78 kibibytes
- rebuilt wheel with compresslevel=9: 13380.84 kibibytes (0.96% better)
I combined the compresslevel parameter with the choice to store files that would result in a bigger compressed file as Stored, not Deflate.
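A minimal sketch of that combination, assuming Python 3.7+ for the compresslevel argument: each file is test-compressed as a raw deflate stream, and it is written as Stored whenever Deflate would not make it smaller. The helper name write_smallest and the sample files are hypothetical, not the script used for the numbers above.

```python
import io
import os
import zipfile
import zlib


def write_smallest(zf: zipfile.ZipFile, name: str, data: bytes) -> int:
    """Write data as Deflate (level 9) if that is smaller, else as Stored.

    Returns the zipfile compression constant that was used.
    """
    # wbits=-15 produces a raw deflate stream, the format zip members use.
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    deflated_len = len(comp.compress(data) + comp.flush())
    if deflated_len < len(data):
        method = zipfile.ZIP_DEFLATED
    else:
        method = zipfile.ZIP_STORED
    # compresslevel is ignored for ZIP_STORED members.
    zf.writestr(name, data, compress_type=method, compresslevel=9)
    return method


text = b"x = 1\n" * 1000          # highly compressible
noise = os.urandom(4096)          # effectively incompressible

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    text_method = write_smallest(zf, "pkg/module.py", text)
    noise_method = write_smallest(zf, "pkg/blob.bin", noise)

print("module.py deflated:", text_method == zipfile.ZIP_DEFLATED)
print("blob.bin stored:   ", noise_method == zipfile.ZIP_STORED)
```

Test-compressing every file costs extra build time; the gain is limited to files that Deflate would otherwise inflate slightly.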
Wheels are zipfiles, which are compressed per file, and the zip metadata (filenames) is not compressed. If you were to create the zip file with no compression ("store") and then lzma-compress the whole thing, you would see better results. Compression algorithms work better on large inputs. Convincing others to accept those wheels would be the tricky part.
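The per-file versus whole-archive difference can be sketched as follows. The file set is synthetic (many small, similar modules, which is the case where per-file compression does worst), and build_zip is a hypothetical helper, so treat the output as an illustration only.

```python
import io
import lzma
import zipfile


def build_zip(files: dict, method: int) -> bytes:
    """Build a zip archive in memory, using `method` for every member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Synthetic stand-in for a package made of many small, similar files.
files = {
    f"pkg/mod_{i:03d}.py": (f"def f_{i}(x):\n    return x + {i}\n" * 20).encode()
    for i in range(200)
}

# Usual layout: each member deflated on its own, headers left uncompressed.
per_file = build_zip(files, zipfile.ZIP_DEFLATED)

# Alternative: store everything, then compress the archive as a single stream.
stored = build_zip(files, zipfile.ZIP_STORED)
whole = lzma.compress(stored)

print("per-file deflate:  ", len(per_file), "bytes")
print("stored zip + lzma: ", len(whole), "bytes")
```

The result of the second approach is an .xz stream rather than a zip file, so existing installers cannot read it directly, which is the acceptance problem mentioned above.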
Yeah, there are bound to be practical difficulties with compression other than zlib.
I like the idea of improving compression. Does it make sense to use only zlib forever? But you'd have to update more tools than just bdist_wheel to pull it off. Another zipfile compression trick is to put one "stored" zipfile inside another one.
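A minimal sketch of the nested-zipfile trick, staying with zlib only: the inner archive stores its members uncompressed, and the outer archive deflates that inner archive as one member, so similar files can share a compression window. The member name inner.zip and both helper names are assumptions made for this illustration; no existing tool is known to expect this layout.

```python
import io
import zipfile


def nested_zip(files: dict) -> bytes:
    """Put a stored inner zip inside a deflated outer zip."""
    inner = io.BytesIO()
    with zipfile.ZipFile(inner, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # "inner.zip" is a hypothetical member name chosen for this sketch.
        zf.writestr("inner.zip", inner.getvalue())
    return outer.getvalue()


def read_nested(blob: bytes, name: str) -> bytes:
    """Read one file back out of the two-layer archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as outer:
        inner_bytes = outer.read("inner.zip")
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return inner.read(name)


files = {
    "pkg/__init__.py": b"from .core import run\n",
    "pkg/core.py": b"def run():\n    return 'ok'\n" * 50,
}
blob = nested_zip(files)
print(len(blob), "bytes;", read_nested(blob, "pkg/core.py")[:10])
```

Any reader has to unpack the outer layer before it can list or extract individual files, which is why more tools than bdist_wheel would need updating.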
I've done some groundwork for this in PR #316. Once I am sure that PyPI will reject bzip2/lzma based wheels, I can add support for other compression algorithms as well.
Cool. It's sort of good and probably helpful for the few wheels that have big individual files.
> I've done some groundwork for this in PR #316. Once I am sure that PyPI will reject bzip2/lzma based wheels, I can add support for other compression algorithms as well.
@agronholm, will PyPI reject bzip2/lzma? May I know the reason? Thanks.
I am trying to reduce the binary size of our wheel, and LZMA shows good potential. It would be great if wheel and PyPI could support it.
Those are not supported on Python 2, and thus historically wheels did not support them. Going forward, the plan seems to be to have two layers in the zip where the actual content is xz compressed, making it even more efficient.
Thanks for the quick response.
But if my package only targets Python 3, will there be any blocking issue with uploading my wheel to PyPI and letting users install my package via pip3?
I honestly don't know. I know it's not supported or recommended.
Got it, thanks.