openwrt icon indicating copy to clipboard operation
openwrt copied to clipboard

image-commands: add gzip-libdeflate advanced compressor

Open plappermaul opened this issue 3 years ago • 21 comments

Several devices provide U-Boot versions with only gzip compressed kernel support (e.g. Realtek switches). To avoid going the hard way with lzma-loader we can make use of enhanced gzip tool based on libdeflate compression library from https://github.com/ebiggers/libdeflate. Output is fully gzip compatible. Add new tool to image-commands and make use of it for Realtek targets.

Image sizes before (gzip)

6402370 openwrt-realtek-rtl838x-d-link_dgs-1210-28-initramfs-kernel.bin
6030127 openwrt-realtek-rtl838x-d-link_dgs-1210-28-squashfs-sysupgrade.bin

Image sizes after (gzip-libdeflate)

6118634 openwrt-realtek-rtl838x-d-link_dgs-1210-28-initramfs-kernel.bin
5767983 openwrt-realtek-rtl838x-d-link_dgs-1210-28-squashfs-sysupgrade.bin

plappermaul avatar Jul 26 '22 07:07 plappermaul

Original PR https://github.com/openwrt/openwrt/pull/10162 got lost during repo sync. Restart it cleanly.

plappermaul avatar Jul 26 '22 07:07 plappermaul

please do not open new PR's, if you need help with git, or something weird happened to your commit just ask for help

mcprat avatar Jul 26 '22 10:07 mcprat

is the purpose of this tool just to save some space because it is compressed more?

mcprat avatar Jul 26 '22 11:07 mcprat

is the purpose of this tool just to save some space because it is compressed more?

So it is. The deflate algorithm offers some improvement beyond gzip level 9. This library just tries harder to compress data but keeps output deflate and therefore gzip compatible.

plappermaul avatar Jul 26 '22 11:07 plappermaul

is the purpose of this tool just to save some space because it is compressed more?

So it is. The deflate algorithm offers some improvement beyond gzip level 9. This library just tries harder to compress data but keeps output deflate and therefore gzip compatible.

so I would write the beginning of the commit message in a way that outlines the purpose of replacing gzip with this, that's not device-specific, just like how you told me here

mcprat avatar Jul 26 '22 11:07 mcprat

Comments have been updated

plappermaul avatar Jul 26 '22 11:07 plappermaul

in the first commit, only Makefile in tools are changed, so title should start tools:

mcprat avatar Jul 26 '22 22:07 mcprat

in the first commit, only Makefile in tools are changed, so title should start tools:

Done

plappermaul avatar Jul 27 '22 06:07 plappermaul

Anything else I can do to get this PR integrated?

plappermaul avatar Aug 12 '22 20:08 plappermaul

How much longer does the compression take with this patch? Does it create reproducible binaries?

hauke avatar Aug 13 '22 13:08 hauke

@plappermaul any update to @hauke's question? I think this is a great idea and getting it merged sounds like a great thing!

oliv3r avatar Sep 20 '22 09:09 oliv3r

In my opinion this is a no-brainer. As I'm currently only on initramfs during development I do not care with high priority.

plappermaul avatar Sep 20 '22 09:09 plappermaul

@plappermaul @hauke 's worried about reproduciability. E.g. if you ~compile the exact same thing twice~ (or compress the exact same thing twice is a more fair point) do you get exactly the same binary. If you can answer that question, I think it should be merged.

oliv3r avatar Sep 20 '22 18:09 oliv3r

As requested a small test with 60 MB of x86-64 linux executables concatenated into one large file.

[user@fedora libdeflate]$ time gzip -9 -c bla > bla1.gz
real	0m11,801s
user	0m11,305s
sys	0m0,081s
[user@fedora libdeflate]$ time gzip -9 -c bla > bla2.gz
real	0m11,884s
user	0m11,354s
sys	0m0,089s
[user@fedora libdeflate]$ diff bla1.gz bla2.gz 
[user@fedora libdeflate]$ time ./gzip_deflate -12 -c bla > bla3.gz
real	0m10,165s
user	0m9,843s
sys	0m0,070s
[user@fedora libdeflate]$ time ./gzip_deflate -12 -c bla > bla4.gz
real	0m10,076s
user	0m9,754s
sys	0m0,080s
[user@fedora libdeflate]$ diff bla3.gz bla4.gz 
[user@fedora libdeflate]$ ls -al bla*
-rw-r--r--. 1 user user 61133531 26. Sep 20:00 bla
-rw-r--r--. 1 user user 23988132 26. Sep 20:03 bla1.gz
-rw-r--r--. 1 user user 23988132 26. Sep 20:03 bla2.gz
-rw-r--r--. 1 user user 22986568 26. Sep 20:04 bla3.gz
-rw-r--r--. 1 user user 22986568 26. Sep 20:04 bla4.gz

To sum it up:

  • 5% better compression
  • 10% faster
  • no difference in output between two runs

plappermaul avatar Sep 26 '22 18:09 plappermaul

* no difference in output between two runs

Do you still have the binaries from that test? I think the question was 'what is the output of md5sum (or whatever) are they identical. I think they would be; but that's ultimately the concern raised. (Date/timestamps are always nice to deal with in reproducible builds though ...)

oliv3r avatar Oct 03 '22 08:10 oliv3r

that's why I ran the diff command. No output = same content.

plappermaul avatar Oct 03 '22 12:10 plappermaul

diff isn't perfect, a hash or cmp command is a little more convincing

mcprat avatar Oct 04 '22 21:10 mcprat

that's why I ran the diff command. No output = same content.

Fair enough, I didn't spot the diff, as I was naively scanning for checksums :)

I'm convinced enough; though I understand @mpratt14's comment.

LGTM!

oliv3r avatar Oct 06 '22 08:10 oliv3r

@plappermaul you need to rebase

@neheb @ynezz @aparcar @jow- since we change the default tool for gzip is this problematic to use a superior but custom tool for the task? I'm in favor for this and looks to have only advantages but wonder if you have additional comments about this.

Ansuel avatar Oct 13 '22 15:10 Ansuel

I assume this will get shot down. It adds another tool. People have been complaining about tools/ taking forever to compile.

The smaller size is nice though.

neheb avatar Oct 13 '22 15:10 neheb

@neheb don't know... i don't think compiling this gzip variant will add that much time and also if you are recompiling tools everytime you are probably an advanced dev or you are doing something wrong with your buildroot (like using make distclean for every compilation)

Also using compiled tool would remove any problem related to using host tools and make things more independent.

Ansuel avatar Oct 13 '22 15:10 Ansuel

rebased

plappermaul avatar Oct 27 '22 14:10 plappermaul