Stop using .tar.bz, maybe??
Describe the bug
The current releases use CEF_ARCHIVE_FORMAT set to tarbz. This is extremely slow to decompress. Bzip2 unpacks slower than xz and does not even compress better.
To Reproduce
Steps to reproduce the behavior:
- Go to https://cef-builds.spotifycdn.com/index.html
- Click on a PDB download
- Wait
- Decompress
- WAIT, DRINK COFFEE
Expected behavior
We could really use xz to get at least double the decompression speed. Or even zstd, at the cost of worse compression. These two are extremely widespread.
Screenshots
Versions (please complete the following information):
- OS: Windows 11, but really does not matter
Additional context
Python tarfile has xz support since 3.3. You don't even need to get an external program!
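For illustration, here is a minimal sketch of writing and reading a .tar.xz with nothing but the standard library. The helper names `make_tar_xz` and `list_tar_xz` are made up for this example and are not part of make_distrib.py.

```python
import io
import tarfile


def make_tar_xz(files: dict[str, bytes]) -> bytes:
    """Pack a name -> content mapping into an in-memory .tar.xz archive."""
    buf = io.BytesIO()
    # "w:xz" selects the built-in LZMA compressor; no external xz binary is used.
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_tar_xz(blob: bytes) -> list[str]:
    """Return the member names stored in an in-memory .tar.xz archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:xz") as tar:
        return tar.getnames()
```

For a real directory tree, `tarfile.open(path, "w:xz")` followed by `tar.add(directory)` would be the equivalent call.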
What is the size difference between xz and bz2 when creating archives using Python?
The compression method of the lzma library is identical to xz defaults (preset 6), according to the documentation. Knowing that, I decompressed cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar.bz2 into the tar, then recompressed it with xz.
$ ls -l cef*
-rw-r--r-- 1 arthu arthu 825856000 May 12 14:17 cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar
-rw-r--r-- 1 arthu arthu 275852077 May 12 14:17 cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar.bz2
-rw-r--r-- 1 arthu arthu 201699604 May 12 14:17 cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar.xz
Huh, much smaller. Decompression timing:
$ time bzip2 -d -c cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar.bz2 >/dev/null
real 0m30.315s
user 0m15.593s
sys 0m0.187s
$ time xz -d -c cef_binary_113.1.4+g327635f+chromium-113.0.5672.63_windows64.tar.xz > /dev/null
real 0m12.301s
user 0m5.265s
sys 0m0.203s
And much faster.
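If anyone wants to repeat the comparison without the command-line tools, a rough Python sketch could look like the following. It uses the default settings of the `bz2` and `lzma` modules; the function name `compare` is hypothetical, and the numbers will depend heavily on the payload and the machine, so none are claimed here.

```python
import bz2
import lzma
import time


def compare(payload: bytes) -> dict[str, dict[str, float]]:
    """Compress payload with bz2 and xz; report packed size and decompression time."""
    results = {}
    for name, codec in (("bz2", bz2), ("xz", lzma)):
        packed = codec.compress(payload)  # module defaults for each codec
        start = time.perf_counter()
        unpacked = codec.decompress(packed)
        elapsed = time.perf_counter() - start
        assert unpacked == payload  # sanity check: the round trip is lossless
        results[name] = {"size": len(packed), "seconds": elapsed}
    return results
```

Feeding it the bytes of the uncompressed .tar should give a comparison along the lines of the shell timings above, minus the disk I/O.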
While we are at it, the state of make_distrib.py really isn't good. The else: create_7z_archive(dir, archive_format) branch is dead code. Well, technically the entire 7z function is...
Then 7zip is simply better; it was used initially but then rejected for some reason. .tar.bz or .tar.xz will always be slower, since it is not a true "solid" archive: depending on the tool, you have to decompress the .bz layer first and then extract the .tar. As a consumer I expect acceptable interoperability, and tar variations on Windows do not offer that (I have no issues personally, but in general it is not ideal). Plain .zip is still the winner in this sense, with 7zip right after it. 7zip is also used by the Chromium build (for the installer), so it should already be on board, at least for Windows.
We also need to consider what comes default-installed on most OSes, and what is supported by common tools like CMake and TeamCity. Also related to issue #2446 (symlink support).
xz comes installed by default on most OSes. With tar, the archive is always purely solid and has a good encoding story (almost always UTF-8). The two-level decompression is a result of how archive programs are designed on Windows: they are built around showing file contents instead of just doing a full streaming extraction. But since tar has no central directory, it takes a full decompression to show the contents anyway. The point is, it's not tar's fault. See also https://github.com/M2Team/NanaZip/issues/138
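To make the streaming point concrete, here is a small sketch using tarfile's stream mode (`"r|xz"`), which decompresses and walks the members in a single forward pass with no intermediate .tar on disk. The function name `stream_members` is invented for this example.

```python
import tarfile


def stream_members(stream) -> list[tuple[str, int]]:
    """Read a .tar.xz sequentially and return (name, size) for each regular file."""
    seen = []
    # "r|xz" is stream mode: the input is only read forward, never seeked,
    # so decompression and tar parsing happen together in one pass.
    with tarfile.open(fileobj=stream, mode="r|xz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                seen.append((member.name, len(data)))
    return seen
```

Any object with a `read()` method works as `stream`, for example an HTTP response body, so the download and the extraction can overlap.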
7z has a stronger encoding story (mandatory UTF-16), option to be selectively solid, but two issues: pre-installation (partially solved by bsdtar support) and symlink (uh-oh).
zip is a fragmented mess. No solid support, okay pre-installation. Symlink support is possible via Info-ZIP extension but does not seem to be present in Python zipfile.
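For what it's worth, the symlink convention can be handled by hand in Python: the link is stored as a regular entry whose content is the target path, with the Unix file type in the upper 16 bits of `external_attr`. The sketch below assumes that convention; `add_symlink` and `symlinks_in` are hypothetical helpers, not zipfile APIs.

```python
import stat
import zipfile


def add_symlink(zf: zipfile.ZipFile, link_name: str, target: str) -> None:
    """Store a symlink entry: Unix mode bits mark it, the body holds the target."""
    info = zipfile.ZipInfo(link_name)
    info.create_system = 3  # 3 = Unix, so readers interpret the mode bits
    info.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(info, target)


def symlinks_in(zf: zipfile.ZipFile) -> dict[str, str]:
    """Map each symlink entry in the archive to its stored target path."""
    links = {}
    for info in zf.infolist():
        mode = info.external_attr >> 16
        if stat.S_ISLNK(mode):
            links[info.filename] = zf.read(info).decode("utf-8")
    return links
```

Whether a given unzip tool actually recreates the link on extraction is a separate question, which is part of the fragmentation problem.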
To clarify, I'm not against .xz; it is virtually the same thing and also provides a good compression ratio, which I welcome.
Also, Windows 10 has tar (bsdtar) on board, but again it is virtually useless, as it only has gzip support. Because of this, 7zip is the winner, since a third-party tool is needed anyway.
First time hearing this! Interestingly, tar xf on a tar.bz2 works, so they have also put in bzip2 support. Since there's no bzip2.exe in my PATH, it's probably compiled in via a library. Which is a bit of a surprise if you think about it, since they could've as easily linked to the public-domain liblzma too.
Ah you know what, let me throw something in the Feedback Hub. No idea if they read it.
@Artoria2e5 My tar requires bzip2.exe, and it doesn't work because bzip2 is absent. Windows 10 also includes curl. Nice, but it is compiled without zlib/gzip support, so it can't download a compressed deflate stream. And I'm using a standalone curl anyway. Agreed that it is kind of strange. :)
Huh, Microsoft is now making the built-in bsdtar the basis of a new feature, it seems. https://www.bleepingcomputer.com/news/microsoft/windows-11-getting-native-support-for-7-zip-rar-and-gz-archives/
I got "working on it" tagged in the Feedback Hub, so they are putting some work in it.
https://github.com/chromiumembedded/cef/issues/3503#issue-1706183376
I support this.