Why tar.zst file over gzip compression in the Linux release?

Open MaurGi opened this issue 6 months ago • 17 comments

What is the purpose of using a tar.zst file over a tar.gz for the release?

I'm trying to install this on Azure Linux, but the unzstd tool is not available there, which makes installation hard because I am not an admin on the machine.

Given that this is a Linux binary, why not the standard tar.gz compression?

thx

MaurGi avatar Jul 09 '25 02:07 MaurGi

I use it because I like zstd and the distributions I use have support for it. I'll mark this up for discussion.

lhecker avatar Jul 09 '25 13:07 lhecker

Why not .tar.xz?

tar cf something.tar something/
xz -9e something.tar

usually gives the best compression. And why do we apply compression? Because we want to reduce the download size. And this combination does it the best way.

About availability in various distros: all modern popular Linux distros have xz preinstalled.

IvanPizhenko avatar Jul 10 '25 16:07 IvanPizhenko

I fully agree. The zstd package that you need to install to unpack the archive is literally two times larger than msedit itself.

Moreover, at least in the case of msedit, gzip actually achieves a better compression ratio. And while zstd does decompress slightly faster, the difference is negligible.

[me@host decompress-bench]$ ls -lh
total 224K
-rw-r--r-- 1 me me 115K Jul 10 20:26 edit-1.2.0-x86_64-linux-gnu.tar.zst
-rw-r--r-- 1 me me 108K Jul 10 20:26 edit.tgz
edit-1.2.0-x86_64-linux-gnu.tar.zst  edit.tgz
[me@host decompress-bench]$ time tar -xf edit-1.2.0-x86_64-linux-gnu.tar.zst

real    0m0.019s
user    0m0.007s
sys     0m0.011s
[me@host decompress-bench]$ rm edit
[me@host decompress-bench]$ time tar -xf edit.tgz

real    0m0.029s
user    0m0.003s
sys     0m0.022s


Leopold702 avatar Jul 10 '25 17:07 Leopold702

For what it's worth, Azure Linux does support zstd-compressed tar. It comes with bsdtar (from libarchive) as well as libzstd.

https://github.com/user-attachments/assets/199ad1ab-6541-48e8-bce4-3c6bf8d6fead
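
For readers without admin rights, a minimal sketch of checking which zstd-capable tool is already on the machine before installing anything. The function names `pick_zst_extractor` and `extract_tar_zst` are hypothetical helpers made up for this illustration, and finding `bsdtar` on the PATH does not by itself prove it was built with zstd support.

```shell
# pick_zst_extractor: print the first tool found that can usually read .tar.zst,
# or "none". Presence of bsdtar does not guarantee it was built with zstd support.
pick_zst_extractor() {
    for tool in bsdtar unzstd zstd; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo none
}

# extract_tar_zst ARCHIVE: unpack ARCHIVE into the current directory
# using whichever tool pick_zst_extractor reported.
extract_tar_zst() {
    case $(pick_zst_extractor) in
        bsdtar) bsdtar -xf "$1" ;;
        unzstd) unzstd -c "$1" | tar -xf - ;;
        zstd)   zstd -dc "$1" | tar -xf - ;;
        *)      echo "no zstd-capable tool found" >&2; return 1 ;;
    esac
}
```

Usage would look like `extract_tar_zst edit-1.2.0-x86_64-linux-gnu.tar.zst`, run from the directory where the binary should land.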

DHowett avatar Jul 10 '25 18:07 DHowett

Here's how different compression algorithms perform on the edit executable using the -9 (max compression level) option:

-rwxr-xr-x 1 ivan ivan 222152 Jul 10 21:37 edit
-rwxr-xr-x 1 ivan ivan 109545 Jul 10 21:36 edit1.gz
-rwxr-xr-x 1 ivan ivan  95840 Jul 10 21:37 edit2.xz
-rwxr-xr-x 1 ivan ivan 105676 Jul 10 21:37 edit3.bz2
-rwxr-xr-x 1 ivan ivan 109736 Jul 10 21:37 edit4.zst

So, as I said above (https://github.com/microsoft/edit/issues/571#issuecomment-3058069822), xz is the best one.
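
A comparison like the one above could be reproduced with a small sketch along these lines. `compare_sizes` is a hypothetical helper name, it skips any compressor that is not installed, and it passes `-9` to every tool as in the listing above (for zstd, `-9` is not its highest level).

```shell
# compare_sizes FILE: for each compressor that is installed, print
# "<tool> <compressed size in bytes>" using level -9, without writing any files.
compare_sizes() {
    file=$1
    for tool in gzip bzip2 xz zstd; do
        if command -v "$tool" >/dev/null 2>&1; then
            # -c sends the compressed stream to stdout; wc -c counts its bytes.
            size=$("$tool" -9 -c "$file" | wc -c | tr -d ' ')
            printf '%s %s\n' "$tool" "$size"
        fi
    done
}
```

Running `compare_sizes edit` prints one line per available tool, so the numbers can be compared directly against the uncompressed size from `wc -c < edit`.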

IvanPizhenko avatar Jul 10 '25 18:07 IvanPizhenko

I don't think that compression ratios are really so important here, considering that msedit itself is already very small. It's much more important that people can unpack the file on whatever system they are using, and gzip is the best option for this purpose, as it's by far the most common and is preinstalled on virtually every distro.

Leopold702 avatar Jul 10 '25 20:07 Leopold702

With xz you get both - best compression and availability on virtually any modern distro.

IvanPizhenko avatar Jul 10 '25 22:07 IvanPizhenko

Yes, you can decompress xz files on any modern distro by default, unless it's an extremely cut down version. Even Windows has it!

itsKhalidHossain avatar Jul 11 '25 10:07 itsKhalidHossain

Do you really need to gain 10 MB (edit: 14 KB) in compression over having the best possible compatibility?

I think the goal of this file is to end up on as many machines as possible; compression performance seems secondary to me. Making people think about which tool to use, rather than just using plain old tar.gz, seems unnecessary.

Especially because this tool is trying to connect two communities: the Windows folks used to Windows editors and the Linux community.

I think saving any MB in download space is not worth losing any user.

MaurGi avatar Jul 11 '25 17:07 MaurGi

For what it's worth, Azure Linux does support zstd-compressed tar. It comes with bsdtar (from libarchive) as well as libzstd.

Thanks @DHowett - will use this.

MaurGi avatar Jul 11 '25 17:07 MaurGi

I think saving any MB in download space is not worth losing any user.

More like 14 KiB based on the comparison above. Agreed though, +1 to just using tar.gz, as it's the quasi-default and available everywhere, and any savings in file size don't matter at all.

Consolatis avatar Jul 11 '25 17:07 Consolatis

I did not expect everybody to be so opinionated about compression formats. 😆

DHowett avatar Jul 11 '25 18:07 DHowett

Yes, this discussion has certainly run its course. The next release will use .tar.gz, because the package is small and .gz is the oldest / most widely used compression type.

lhecker avatar Jul 11 '25 21:07 lhecker

Current releases for Linux have only one file (edit) so tar-ing is redundant. Unless more files are added in future releases, just compressing the file with gzip would make it work on any Linux distro.

zcobol avatar Jul 12 '25 18:07 zcobol

Current releases for Linux have only one file (edit) so tar-ing is redundant.

@zcobol Our first releases were like this. It confused people. They asked for it to be changed.

DHowett avatar Jul 12 '25 18:07 DHowett

Current releases for Linux have only one file (edit) so tar-ing is redundant. Unless more files are added in future releases, just compressing the file with gzip would make it work on any Linux distro.

Tarring preserves the file mode, which saves you an additional chmod, while still unpacking with a single command, e.g. tar xaf archive.tar.gz
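
A minimal sketch of that point, assuming a throwaway stand-in file named `edit` (not the real binary) and a default umask: the executable bit set before packing is still there after unpacking, with no chmod in between.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a stand-in "edit" script and mark it executable.
printf '#!/bin/sh\necho hello\n' > edit
chmod 755 edit

# Pack and gzip in one step, then unpack into a separate directory.
tar -czf edit.tar.gz edit
mkdir unpacked
tar -xzf edit.tar.gz -C unpacked

# Check whether the unpacked copy is still executable.
if [ -x unpacked/edit ]; then
    result="executable bit preserved"
else
    result="executable bit lost"
fi
echo "$result"
```

A bare gzip-compressed file downloaded through a browser typically arrives without the executable bit, which is where the extra chmod would otherwise come in.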

IvanPizhenko avatar Jul 13 '25 11:07 IvanPizhenko

@zcobol It's not uncommon for various tools to block downloading of naked executables, and while it may be a bit irrational, I feel more comfortable downloading a tarball than an executable.

JakeSays avatar Jul 14 '25 14:07 JakeSays