delta compression for packages

Open llogiq opened this issue 9 years ago • 8 comments

Nightlies especially could benefit, on both the server and the client side.

llogiq avatar Aug 12 '16 21:08 llogiq

Stuck on a poor internet connection for the holiday, and I agree. xdelta3 is the best bindiff option that I know of.

spease avatar Nov 25 '17 21:11 spease

perhaps RFC 3229 (delta encoding in HTTP) could be used for this.

lolbinarycat avatar Sep 20 '24 21:09 lolbinarycat

did some testing with xdelta.

nightly-2024-06-10-x86_64-unknown-linux-gnu/bin/rustc is 2.6M
the 2024-09-19 nightly is also 2.6M

the delta between them is 525K (down to 487K when running with -9). that's about a 5x size reduction, pretty good!

keep in mind these are non-ideal conditions, assuming no update in 3 months.
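
A quick check of that arithmetic, as a minimal sketch. It assumes the K and M figures are binary units (1M = 1024K), which the comment does not specify; the constant and function names are illustrative only.

```python
# Sizes as reported in the comment; treating 1M as 1024K is an assumption.
FULL_K = 2.6 * 1024   # rustc binary from either nightly, ~2.6M
DELTA_K = 525         # xdelta output with default settings
DELTA_MAX_K = 487     # xdelta output when run with -9


def reduction_factor(full: float, delta: float) -> float:
    """Return how many times smaller the delta is than the full file."""
    return full / delta


print(f"default: {reduction_factor(FULL_K, DELTA_K):.1f}x")
print(f"-9:      {reduction_factor(FULL_K, DELTA_MAX_K):.1f}x")
```

Under that assumption the factors come out at roughly 5.1x and 5.5x, consistent with the "about 5x" estimate.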

lolbinarycat avatar Sep 21 '24 20:09 lolbinarycat

it's worth noting that xdelta3 is just a piece of software; vcdiff is the underlying file format.

unfortunately RFC 3229 doesn't have much in terms of software support, but it's simple enough that it could probably be implemented mostly with middleware shims.
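
To make the RFC 3229 idea concrete, here is a rough sketch of the header logic a client-side shim might need. The `A-IM` and `IM` headers, the `vcdiff` instance manipulation, and the `226 IM Used` status come from RFC 3229; the function names and the example entity tag are hypothetical, and nothing here reflects how rustup or the Rust distribution servers actually behave.

```python
def delta_request_headers(cached_etag: str) -> dict[str, str]:
    """Headers asking the server for a vcdiff delta against a cached copy.

    `cached_etag` is the entity tag of the version the client already has.
    """
    return {
        "If-None-Match": cached_etag,  # identifies the base instance
        "A-IM": "vcdiff",              # instance manipulations the client accepts
    }


def is_delta_response(status: int, headers: dict[str, str]) -> bool:
    """True if the response body is a vcdiff delta rather than a full file.

    Simplification: header names are matched case-sensitively here.
    """
    manipulations = [m.strip().lower() for m in headers.get("IM", "").split(",")]
    return status == 226 and "vcdiff" in manipulations


# Hypothetical usage: the entity tag value is made up for illustration.
request = delta_request_headers('"nightly-2024-06-10"')
print(request)
print(is_delta_response(226, {"IM": "vcdiff"}))  # delta body, apply to cached file
print(is_delta_response(200, {}))                # full body, use as-is
```

A server that does not understand `A-IM` can ignore it and return an ordinary `200` response, so a shim like this would fall back to a full download.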

lolbinarycat avatar Sep 21 '24 20:09 lolbinarycat

I think this is an interesting idea whose time may have come. The implementation in rustup could be fairly straightforward, but I think the majority of the work here will be on the backend. As such, I recommend starting with an issue against https://github.com/rust-lang/infra-team (and please mention the new issue here if you do, so that we can coordinate).

djc avatar Sep 22 '24 08:09 djc

FWIW, I tried a little experiment. I computed a binary delta with xdelta on the librustc_driver.so file, between Rust 1.80.0 and 1.81.0. The original file is ~125 MiB, or 32 MiB with XZ compression. The delta file is ~91 MiB, or 33 MiB with XZ compression. So in this case it does not seem to be worth it.

Between two consecutive nightlies (2024-10-11 and 2024-10-10), the delta was 66 MiB uncompressed and 26 MiB after XZ compression. That's better, but it still does not seem worth the complexity, tbh, at least for this specific compiler artifact.
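
For reference, the transfer savings implied by those numbers can be worked out as below. This is only a sketch: it assumes the full XZ-compressed file is about 32 MiB in both cases (that figure was given for the stable release, not for the nightlies), and the names are illustrative.

```python
# Assumption: the XZ-compressed full file is ~32 MiB for the nightlies too.
FULL_XZ_MIB = 32


def saving_percent(full_mib: float, delta_mib: float) -> float:
    """Percentage of download size saved by fetching the delta instead.

    A negative value means the delta is larger than the full file.
    """
    return (full_mib - delta_mib) / full_mib * 100


print(f"1.80.0 -> 1.81.0:      {saving_percent(FULL_XZ_MIB, 33):.1f}%")
print(f"consecutive nightlies: {saving_percent(FULL_XZ_MIB, 26):.1f}%")
```

With these inputs the stable-to-stable delta costs about 3% more than a full download, while the nightly-to-nightly delta saves a little under 19%.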

Kobzol avatar Oct 12 '24 17:10 Kobzol

what if you did a diff of two uncompressed tarballs of the entire component? since currently we download whole components at a time.
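
A rough sketch of how that experiment could be scripted, assuming the components are distributed as `.tar.xz` archives and that `xdelta3` is installed. The helper names and file names are hypothetical; the `xdelta3` flags used are `-e` (encode), `-9` (compression level), `-f` (overwrite output) and `-s` (source file).

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz(src: Path, dst: Path) -> None:
    """Write the decompressed contents of an .xz file to dst."""
    with lzma.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def xdelta3_encode_cmd(old_tar, new_tar, delta, level: int = 9) -> list[str]:
    """Build the xdelta3 command that encodes new_tar as a delta against old_tar."""
    return ["xdelta3", "-e", f"-{level}", "-f",
            "-s", str(old_tar), str(new_tar), str(delta)]


# Hypothetical usage (file names are placeholders, not real artifact names):
#   import subprocess
#   decompress_xz(Path("old.tar.xz"), Path("old.tar"))
#   decompress_xz(Path("new.tar.xz"), Path("new.tar"))
#   subprocess.run(xdelta3_encode_cmd("old.tar", "new.tar", "new.vcdiff"), check=True)
#   print(Path("new.vcdiff").stat().st_size)
print(xdelta3_encode_cmd("old.tar", "new.tar", "new.vcdiff"))
```

For a fair comparison, the resulting delta would still need to be XZ-compressed and measured against the size of the compressed tarball that is downloaded today.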

lolbinarycat avatar Oct 12 '24 18:10 lolbinarycat

I only had a small script for a single file, and I don't have much time to experiment with this right now, especially since the results so far have been underwhelming. But if you can do that experiment, I would be interested in the result.

Kobzol avatar Oct 13 '24 15:10 Kobzol