zlib
Add support for IBM Z hardware-accelerated deflate
Note: this PR is based on https://github.com/madler/zlib/pull/750 and https://github.com/iii-i/zlib/releases/tag/crc32vx-v6 in order to simplify integration into distributions, which normally want all three changes.
IBM Z mainframes starting from z15 provide the DFLTCC instruction, which implements the deflate algorithm in hardware. Its estimated compression and decompression performance is orders of magnitude faster than that of current zlib, with a compression ratio comparable to that of level 1.
This patch adds DFLTCC support to zlib. It can be enabled using the following build commands:
$ ./configure --dfltcc
$ make
When built like this, zlib will compress in hardware on level 1, and
in software on all other levels. Decompression will always happen in
hardware. In order to enable DFLTCC compression for levels 1-6 (i.e.,
to make it used by default) one could either configure with
--dfltcc-level-mask=0x7e or export DFLTCC_LEVEL_MASK=0x7e at run
time.
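To make the mask values concrete, here is a minimal sketch of how such a level mask can be read. It assumes that bit N of the mask selects compression level N, which is consistent with 0x7e covering levels 1-6; the helper name `level_uses_hardware()` is hypothetical and is not taken from the patch.

```c
/* Hypothetical helper, not part of the patch.
 * Assumption: bit N of the mask corresponds to compression level N. */
static int level_uses_hardware(unsigned mask, int level)
{
    if (level < 0 || level > 9)
        return 0;                       /* outside the 0..9 level range */
    return (int)((mask >> level) & 1u); /* test the bit for this level */
}
```

Under this assumption, 0x7e (binary 0111 1110) selects levels 1 through 6, while a mask of 0x02 would select level 1 only.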
Two DFLTCC compression calls produce the same results only when they
both are made on machines of the same generation, and when the
respective buffers have the same offset relative to the start of the
page. Therefore care should be taken when using hardware compression
when reproducible results are desired. One such use case - reproducible
software builds - is handled explicitly: when the SOURCE_DATE_EPOCH
environment variable is set, the hardware compression is disabled.
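The reproducible-build rule can be illustrated with a small check like the one below. This is only a sketch of the idea, not the code from the patch: the function name is hypothetical, and plain `getenv()` is used here for portability.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: reports whether the
 * environment asks for reproducible output, in which case hardware
 * compression would be skipped. Only the presence of the variable is
 * checked; its value is ignored. */
static int reproducible_output_requested(void)
{
    return getenv("SOURCE_DATE_EPOCH") != NULL;
}
```

A caller would consult such a check once at initialization and fall back to software deflate when it returns nonzero.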
DFLTCC does not support every single zlib feature, in particular:
* `inflate(Z_BLOCK)` and `inflate(Z_TREES)`
* `inflateMark()`
* `inflatePrime()`
* `inflateSyncPoint()`
When used, these functions will either switch to software, or, in case this is not possible, gracefully fail.
This patch tries to add DFLTCC support in the least intrusive way. All SystemZ-specific code is placed into a separate file, but unfortunately there is still a noticeable amount of changes in the main zlib code. Below is the summary of these changes.
DFLTCC takes as arguments a parameter block, an input buffer, an output
buffer and a window. Since DFLTCC requires the parameter block to be
doubleword-aligned, and it's reasonable to allocate it alongside
deflate and inflate states, the ZALLOC_STATE(), ZFREE_STATE() and
ZCOPY_STATE() macros are introduced in order to encapsulate the
allocation details. The same is true for the window, for which
the ZALLOC_WINDOW() and TRY_FREE_WINDOW() macros are introduced.
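As an illustration of the alignment requirement, the sketch below rounds a size up to a doubleword (8-byte) boundary, which is one common way to place an aligned block directly after another structure in a single allocation. The helper name is hypothetical, and this is not claimed to be how ZALLOC_STATE() is implemented.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch: rounds n up to the next
 * multiple of align. align must be a power of two (8 for a doubleword). */
static size_t align_up(size_t n, size_t align)
{
    return (n + (align - 1)) & ~(align - 1);
}
```

For example, if the software state occupied 13 bytes, a parameter block placed at offset `align_up(13, 8)` would start at offset 16, provided the allocation itself is at least doubleword-aligned.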
Software and hardware window formats do not match, therefore,
deflateSetDictionary(), deflateGetDictionary(),
inflateSetDictionary() and inflateGetDictionary() need special
handling, which is triggered using the new
DEFLATE_SET_DICTIONARY_HOOK(), DEFLATE_GET_DICTIONARY_HOOK(),
INFLATE_SET_DICTIONARY_HOOK() and INFLATE_GET_DICTIONARY_HOOK()
macros.
deflateResetKeep() and inflateResetKeep() now update the DFLTCC
parameter block, which is allocated alongside zlib state, using
the new DEFLATE_RESET_KEEP_HOOK() and INFLATE_RESET_KEEP_HOOK()
macros.
The new DEFLATE_PARAMS_HOOK() macro switches between the hardware
and the software deflate implementations when the deflateParams()
arguments demand this.
The new INFLATE_PRIME_HOOK(), INFLATE_MARK_HOOK() and
INFLATE_SYNC_POINT_HOOK() macros make the respective unsupported
calls gracefully fail.
The algorithm implemented in the hardware has a different compression
ratio than the one implemented in software. In order for
deflateBound() to return the correct results for the hardware
implementation, the new DEFLATE_BOUND_ADJUST_COMPLEN() and
DEFLATE_NEED_CONSERVATIVE_BOUND() macros are introduced.
Actual compression and decompression are handled by the new
DEFLATE_HOOK() and INFLATE_TYPEDO_HOOK() macros. Since inflation
with DFLTCC manages the window on its own, calling updatewindow() is
suppressed using the new INFLATE_NEED_UPDATEWINDOW() macro.
In addition to the compression, DFLTCC computes the CRC-32 and Adler-32
checksums, therefore, whenever it's used, the software checksumming is
suppressed using the new DEFLATE_NEED_CHECKSUM() and
INFLATE_NEED_CHECKSUM() macros.
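The general shape of such a hook can be sketched as follows. All names here (`DEMO_NEED_CHECKSUM`, `DEMO_HAVE_ARCH_HOOKS`, `struct demo_stream`, `demo_sum()`) are hypothetical, and the trivial additive checksum merely stands in for Adler-32 or CRC-32. The point is only that the generic code tests a macro which defaults to "always checksum in software" unless an architecture-specific definition overrides it.

```c
/* Hypothetical names throughout; not the definitions from the patch. */
struct demo_stream {
    int hw_active;        /* nonzero when the hardware path is in use */
    unsigned long adler;  /* running checksum */
};

#ifdef DEMO_HAVE_ARCH_HOOKS
/* Architecture override: skip software checksumming on the hardware path. */
#define DEMO_NEED_CHECKSUM(strm) (!(strm)->hw_active)
#else
/* Default: software checksumming is always needed. */
#define DEMO_NEED_CHECKSUM(strm) 1
#endif

/* Stand-in checksum: a plain byte sum, used only for illustration. */
static unsigned long demo_sum(unsigned long sum, const unsigned char *buf,
                              unsigned long len)
{
    unsigned long i;
    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Generic code path: the checksum update is guarded by the hook. */
static void demo_consume(struct demo_stream *strm, const unsigned char *buf,
                         unsigned long len)
{
    if (DEMO_NEED_CHECKSUM(strm))
        strm->adler = demo_sum(strm->adler, buf, len);
}
```

With this layout a build without the architecture-specific code behaves as before, since the default macro evaluates to a constant and the guard disappears at compile time.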
DFLTCC will refuse to write an End-of-block Symbol if there is no input
data, thus in some cases it is necessary to do this manually. In order
to achieve this, send_bits(), bi_reverse(), bi_windup() and
flush_pending() are promoted from local to ZLIB_INTERNAL.
Furthermore, since the block and the stream termination must be handled
in software as well, enum block_state is moved to deflate.h.
Since the first call to dfltcc_inflate() already needs the window,
and it might not be allocated yet, inflate_ensure_window() is
factored out of updatewindow() and made ZLIB_INTERNAL.
Gentle ping.
- Fixed a bug when `Z_SYNC_FLUSH` usage led to incomplete EOBS write.
- Fixed a "goto fail" bug in `dfltcc_deflate_get_dictionary()`.
- Replaced `getenv()` with `secure_getenv()`.
- Added `sys/sdt.h` feature test.
- Fixed 31-bit build:
  - Added machine mode hint for STFLE.
  - Adjusted offset calculations.
- Fixed `sys/sdt.h` feature test.
- Added an entry to `contrib/README.contrib`.
The only issue with this PR is that it looks like it is controlled by CFLAGS injection instead of a proper autoconf `--with{,out}-foo` option.
- Added support for switching between software and hardware compression.
- Added `--dfltcc` configure flag (the old way of building it still works).
- Fix missing EOBS in raw streams.
- Parse environment variables and facility bits only once.
- Fix a number of issues related to switching compression levels.
- Fix building with clang.
- Fix sys/sdt.h detection.
- Make `inflateSyncPoint()` gracefully fail instead of returning an incorrect result.
- Add `--dfltcc-level-mask=...` to `configure` as a shorthand for `CFLAGS=-DDFLTCC_LEVEL_MASK=...`.
- Add a short README.
- Fix `compressBound()`.
Is there any update on merging this PR? This is already used in production systems and there have been no bug reports so far.
Hello Ilya, zlib released a new version, 1.2.12, a few days ago, and this PR is starting to have a lot of conflicts.
Is it possible for you to check and correct them? They are way out of our scope and we could really use your help in this because you understand it the best.
This new release brings a new CVE which has to be fixed ASAP, so could you give me an estimate on how long it could take you to fix these conflicts?
Thank you for your effort.
@ljavorsk I have a preview version here: https://github.com/iii-i/zlib/releases/tag/dfltcc-20220405 So far it's looking good, but I need to run more tests and also fuzz it a little. Hopefully I'll be done by the end of the day.
@iii-i Great news thank you :)
Will you update this PR or create a new one with the fixed patches after you declare it's ready?
Yes, that's the plan.
- Rebased.
- Fix updating strm.adler with inflate().
Hi @iii-i ,
Could you please also rebase the patch on top of the new zlib-1.2.13 version?
Thank you so much :)
- Rebased on top of zlib 1.2.13.
@madler: What do you think?
- Removed unused `endptr` variable in `init_globals()`.
- Support inflate with small window.
- Do not update strm.adler for raw streams (https://bugzilla.redhat.com/show_bug.cgi?id=2155328).
Could you please rebase this patch on top of the https://github.com/madler/zlib/pull/750 PR as we agreed?
There are a few conflicting changes and the patch doesn't apply cleanly, so I would have to rewrite it every single time you rebase it.
Thank you so much @iii-i :)
- Rebased on top of devel and https://github.com/madler/zlib/pull/750.
Hello! I am the zlib maintainer at SUSE and we are experiencing crashes in Firefox on s390x machines. After some investigation, we have found out that they are related to zlib, possibly to incorrect usage by Firefox. However, since it works with the bundled library and on the other architectures, I would also consider this a bug in the hardware-accelerated implementation of deflate.
This is the bug with the description of the issue: https://bugzilla.suse.com/show_bug.cgi?id=1210593
- Fix `deflateBound()` before `deflateInit()`.
Hi,
The change called "add DFLTCC support for using inflate() with a small window" (diff: https://github.com/madler/zlib/compare/26f2c0a4e17e5558d779797d713aa37ebaeef390..28a49d89b4329ca349e1b10e860e6b3f7ebbfa9e) has brought one COMPILER_WARNING: `zlib-1.2.11/inflate.c:1507:14: warning[-Wunused-variable]: unused variable 'wsize'`
This could be easily fixed by removing the variable.
- Remove an unused variable.