distri
Use transparent zstd over HTTP for fetching and exporting packages
- [x] figure out how zstd over HTTP is typically done for maximum compatibility
  - the client sends e.g. an `Accept-Encoding: gzip, deflate` header
  - the server replies with a `Content-Encoding: gzip` header
  - `zstd` is the value that can be used for zstandard as per RFC 8478, section 6.2
  - Upstream discussion at https://github.com/facebook/zstd/issues/315
  - not yet documented e.g. on MDN
- [x] `distri export` should support this
  - blocked on https://github.com/lpar/gzipped/pull/14 being merged
- [x] `distri` HTTP client should support asking for zstd versions
- [x] zstd-compress all packages of the current release
- [x] ensure nginx behind repo.distr1.org serves zstd files when asked with the appropriate header, i.e. build https://github.com/tokers/zstd-nginx-module
- [ ] CloudFlare seems to strip `Accept-Encoding: zstd` right now: https://twitter.com/zekjur/status/1266855979041292290
- [x] recommend mirror operators support zstd
- [ ] In addition to `Accept-Encoding`, we could consider fetching the corresponding .zst file explicitly for squashfs files, perhaps in a Happy Eyeballs-like scheme. This increases support across the whole landscape, e.g. with mirrors, CDNs or proxies that don’t support `Accept-Encoding: zstd` yet.
- [ ] we should add a test to verify that zstd is faster :)
This exploration shows how zstd beats both sequential gzip and parallel gzip significantly for decompressing larger packages such as qemu:
We took a few measurements with the respective uncompress tools on stream: https://youtu.be/dLr_6jJ4N7Y?t=4172
- gunzip: 2.95s (100%)
- unpigz: 1.50s (50%)
- unzstd: 0.51s (17%)
…and then measured distri package installation, which confirms zstd’s effectiveness in our use-case:
- qemu with gzip: `2020/05/30 17:24:30 done, 359.07 MB/s (1149903681 bytes in 3.05s)`
- qemu with zstd: `2020/05/30 17:24:39 done, 603.67 MB/s (1149903681 bytes in 1.81s)`
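To put the figures above side by side, the following sketch recomputes the relative timings and the installation speedup from the numbers as quoted. The helper names `relative` and `speedup` are made up for illustration. Since the inputs are rounded, the results can differ slightly from the percentages listed; the installation throughput ratio comes out at roughly 1.7x in favour of zstd.

```go
package main

import "fmt"

// relative returns t as a percentage of the baseline duration.
func relative(t, baseline float64) float64 {
	return 100 * t / baseline
}

// speedup returns how many times higher the fast throughput is than the slow one.
func speedup(fast, slow float64) float64 {
	return fast / slow
}

func main() {
	// Decompression times in seconds, as quoted from the stream.
	const gunzip, unpigz, unzstd = 2.95, 1.50, 0.51
	fmt.Printf("unpigz: %.1f%% of gunzip\n", relative(unpigz, gunzip))
	fmt.Printf("unzstd: %.1f%% of gunzip\n", relative(unzstd, gunzip))

	// Installation throughput in MB/s, as quoted from the distri log lines.
	const gzipRate, zstdRate = 359.07, 603.67
	fmt.Printf("qemu install with zstd: %.2fx the gzip throughput\n", speedup(zstdRate, gzipRate))
}
```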