mixer-tools
Explore options on zstd for compression performance
From either a CPU or a memory perspective, let's explore whether zstd's options improve its compression performance enough to beat xz by a margin that would justify switching to zstd by default.
I explored various zstd flags (--fast, --adapt, --ultra) and tried different levels. The size vs. time trade-off is not good enough to justify switching to zstd as the default.
@ashleshaAtrey I'm interested in whether you still have those numbers around. It might be worth exposing as a configuration option to the user. Some may find the trade-off worthwhile depending on the particular content they are providing, as eventually this would be used by 3rd-party bundles as well.
@bryteise It is already exposed. You can use it by setting COMPRESSION = ["external-zstd"] in builder.conf.
Total size before compression: 43184317440 bytes

| zstd invocation | total size after compression (bytes) | CREATE FULLFILES |
|---|---|---|
| zstd | 16528392879 | 13m33.7s |
| zstd --fast=22 | 25457700882 | 13m19.447s |
| zstd --adapt | 16354637680 | 13m25.032s |
| zstd --fast | 19045817625 | 13m11.38s |
| zstd --ultra | 16380224911 | 13m22.015s |

Do you have the numbers for the other compression methods as well (xz, bzip2, gzip)?
| method | total size before compression (bytes) | total size after compression (bytes) | CREATE FULLFILES |
|---|---|---|---|
| xz | 43183234560 | 13877547116 | 14m55.218s |
| bzip2 | 43183935488 | 16014229591 | 12m18.674s |
| gzip | 43182304256 | 16910068787 | 11m54.421s |

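To make the size figures easier to compare, here is a minimal Go sketch that turns the before/after byte counts reported in this thread into a compression ratio and a percentage saved. The `result` type and the helper names `ratio` and `savedPercent` are made up for this illustration and are not part of mixer-tools; the zstd row uses the plain zstd run from the earlier comment.

```go
package main

import "fmt"

// result holds one row of the figures reported in this thread.
// The type is hypothetical, used only for this comparison.
type result struct {
	name    string
	before  int64   // total size before compression, bytes
	after   int64   // total size after compression, bytes
	seconds float64 // CREATE FULLFILES wall time, seconds
}

// ratio returns input bytes divided by output bytes (higher is better).
func ratio(before, after int64) float64 {
	return float64(before) / float64(after)
}

// savedPercent returns the size reduction as a percentage of the input.
func savedPercent(before, after int64) float64 {
	return 100 * (1 - float64(after)/float64(before))
}

func main() {
	results := []result{
		{"xz", 43183234560, 13877547116, 14*60 + 55.218},
		{"zstd", 43184317440, 16528392879, 13*60 + 33.7},
		{"bzip2", 43183935488, 16014229591, 12*60 + 18.674},
		{"gzip", 43182304256, 16910068787, 11*60 + 54.421},
	}
	for _, r := range results {
		fmt.Printf("%-6s ratio %.2f  saved %.1f%%  time %.1fs\n",
			r.name, ratio(r.before, r.after),
			savedPercent(r.before, r.after), r.seconds)
	}
}
```

Ratios are only comparable across rows because the inputs are nearly the same size; the "before" totals differ slightly between runs.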
I think that decompression speed is also worth cross-comparing, because xz tends to be slower than zstd in this area, and swupd has to decompress many files in the course of its operation.
Also note Arjan's old blog post where he conducted a detailed cross-comparison for compression types. It would be awesome to produce a followup to that post with the latest findings.
I ran decompression tests on 906752 fullfiles using the tar utility:
- XZ compression: 11134 seconds to decompress
- Zstd compression: 19767 seconds to decompress
- Optimal size (a mix of XZ and Zstd files): 12991 seconds to decompress

Next, I will work on finding the stats for memory used while compressing and decompressing those files.
Memory used while creating fullfiles:
- Alloc is bytes of allocated heap objects.
- TotalAlloc increases as heap objects are allocated, but unlike Alloc and HeapAlloc, it does not decrease when objects are freed.
- Sys is the total bytes of memory obtained from the OS.

Zstd compression
Alloc = 6493 MiB TotalAlloc = 49925 MiB Sys = 9213 MiB
xz compression
Alloc = 5850 MiB TotalAlloc = 49926 MiB Sys = 9344 MiB
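For anyone who wants to reproduce figures like these, the three fields are available from Go's `runtime.MemStats`. The sketch below is not the code used for the measurements above; the helpers `toMiB` and `formatMem` are invented here, and the loop in `main` is only a stand-in workload so that the program has something to report.

```go
package main

import (
	"fmt"
	"runtime"
)

// toMiB converts a byte count to whole mebibytes (truncating).
func toMiB(b uint64) uint64 {
	return b / 1024 / 1024
}

// formatMem renders the three counters in the same shape as the
// figures quoted in this thread. Inputs are byte counts.
func formatMem(alloc, totalAlloc, sys uint64) string {
	return fmt.Sprintf("Alloc = %d MiB TotalAlloc = %d MiB Sys = %d MiB",
		toMiB(alloc), toMiB(totalAlloc), toMiB(sys))
}

func main() {
	// Stand-in workload (an assumption): allocate 64 buffers of 1 MiB.
	// A real measurement would run the fullfile creation here instead.
	bufs := make([][]byte, 0, 64)
	for i := 0; i < 64; i++ {
		bufs = append(bufs, make([]byte, 1<<20))
	}

	var m runtime.MemStats
	runtime.ReadMemStats(&m) // snapshot of the allocator statistics
	fmt.Println(formatMem(m.Alloc, m.TotalAlloc, m.Sys))

	runtime.KeepAlive(bufs) // keep the buffers live until after the snapshot
}
```

Note that these counters describe the Go process only. If compression runs in an external program (as the "external-zstd" setting name suggests), that program's memory would not show up here, which may be part of why the two sets of numbers are so close.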
Interesting, zstd is taking longer to decompress (and by a fairly significant amount too). That's really surprising.
@ashleshaAtrey Can you review whether the numbers and units look correct?
| | compression time (minutes) | decompression time (minutes) | memory usage (Sys, MiB) | compressed size (bytes) |
|---|---|---|---|---|
| xz | 14.91666667 | 185.5666667 | 9344 | 13877547116 |
| zstd | 13.55 | 329.45 | 9213 | 16528392879 |

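As a quick check of the units, the table values can be re-derived from the raw figures posted earlier in the thread. This is a small Go sketch with made-up helper names (`secondsToMinutes`, `parseMinutes`); it relies on `time.ParseDuration`, which accepts strings in the same `14m55.218s` form that the timings were reported in.

```go
package main

import (
	"fmt"
	"time"
)

// secondsToMinutes converts a duration in seconds to fractional minutes.
func secondsToMinutes(s float64) float64 {
	return s / 60
}

// parseMinutes parses a duration such as "14m55.218s" and returns it
// in fractional minutes.
func parseMinutes(s string) (float64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return d.Minutes(), nil
}

func main() {
	// Decompression totals, reported in seconds.
	xzDec, zstdDec := 11134.0, 19767.0
	fmt.Printf("xz   decompression: %.4f min\n", secondsToMinutes(xzDec))
	fmt.Printf("zstd decompression: %.4f min\n", secondsToMinutes(zstdDec))
	fmt.Printf("zstd/xz decompression time: %.2fx\n", zstdDec/xzDec)

	// Compression (CREATE FULLFILES) times, reported as durations.
	for _, s := range []string{"14m55.218s", "13m33.7s"} {
		m, err := parseMinutes(s)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%-11s = %.4f min\n", s, m)
	}
}
```

The decompression columns match the raw seconds divided by 60. The compression columns (14.91666667 and 13.55) correspond to 14m55s and 13m33s, so the fractional seconds appear to have been dropped before converting; the difference is well under a second either way.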
I am very surprised to see zstd have such a large decompression time. In my experience, decompression time for zstd has been dramatically faster than xz
...
Thinking about this a little more.
I wonder if there is a set of files which take disproportionately longer to decompress (but there are few of them) or files that take slightly longer to decompress (but there are many of them)?
Figuring this out would be enlightening. Do you have file by file time differences?
I will work on finding the file-by-file time differences; I don't have those stats handy.
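Once per-file timings exist, one way to tell the two cases apart (a few files that are much slower vs. many files that are slightly slower) is to look at how much of the total slowdown comes from the worst N files. The Go sketch below assumes a hypothetical `timing` record per fullfile; the sample data in `main` is invented purely for illustration and is not a measurement.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// timing is a hypothetical per-file record: how long one fullfile took
// to decompress in each format.
type timing struct {
	name string
	xz   time.Duration
	zstd time.Duration
}

// topShare returns the fraction of the total zstd slowdown (the sum of
// positive zstd-minus-xz deltas) contributed by the n files with the
// largest deltas. Files where zstd was faster are ignored.
func topShare(ts []timing, n int) float64 {
	deltas := make([]time.Duration, 0, len(ts))
	var total time.Duration
	for _, f := range ts {
		if d := f.zstd - f.xz; d > 0 {
			deltas = append(deltas, d)
			total += d
		}
	}
	if total == 0 || n <= 0 {
		return 0
	}
	// Largest slowdown first.
	sort.Slice(deltas, func(i, j int) bool { return deltas[i] > deltas[j] })
	if n > len(deltas) {
		n = len(deltas)
	}
	var top time.Duration
	for _, d := range deltas[:n] {
		top += d
	}
	return float64(top) / float64(total)
}

func main() {
	// Invented sample data, for illustration only.
	sample := []timing{
		{"a", 10 * time.Millisecond, 12 * time.Millisecond},
		{"b", 10 * time.Millisecond, 11 * time.Millisecond},
		{"c", 20 * time.Millisecond, 117 * time.Millisecond},
		{"d", 15 * time.Millisecond, 14 * time.Millisecond},
	}
	fmt.Printf("worst file accounts for %.0f%% of the slowdown\n",
		100*topShare(sample, 1))
}
```

A share close to 1 for a small n would point to a handful of outlier files; a share that grows slowly with n would point to a small per-file overhead spread across all 906752 fullfiles.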