Considering updating ZSTD to 1.5.4
Describe the feature you would like to see added to OpenZFS
The Linux kernel recently updated zstd to the 1.5.2 release. A few days ago, 1.5.4 was released with performance enhancements compared to 1.5.2.
How will this feature improve OpenZFS?
Faster compression
Additional context
Release Changelog: https://github.com/facebook/zstd/releases/tag/v1.5.4
Superseded by 1.5.5: https://github.com/facebook/zstd/releases/tag/v1.5.5
1.5.5 also has a corruption fix; not sure whether this impacts ZFS: https://news.ycombinator.com/item?id=35446847
The corruption fix was for a bug introduced in zstd 1.5.0 (see zstd bug 3517). It seems that ZFS uses zstd 1.4.5 with upstream patches ported by @ryao.
zstd versioning was discussed in #12840 but nothing ever came of it.
As summarized in this comment, https://github.com/openzfs/zfs/issues/12840#issuecomment-1059759178, updating the zstd version is more involved than one might guess. It's not something we're planning to do until we have a very compelling reason. We'll of course continue to backport any critical upstream fixes to the zstd 1.4.5 version we're using.
@behlendorf I was curious if you knew how btrfs dealt with this. Btrfs uses kernel zstd, which has been updated multiple times in the last few years. How do they avoid the same issues that ZFS faces?
I know btrfs added "filesystem defrag -czstd:LEVEL" (tracked in btrfs-progs issue 184; not sure where on the kernel mailing list), but the documentation seems to suggest recursive recompression.
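For what it's worth, here is a minimal sketch of the btrfs workflow as I understand it. The device and mount point are made up, and per-level defrag support is exactly what btrfs-progs issue 184 tracks, so -czstd below selects only the algorithm, not a level:

```sh
# Compress newly written data with zstd at a chosen level (mount option).
mount -o compress=zstd:3 /dev/sdb1 /mnt/data

# Recompress existing data by recursively defragmenting with a target
# compression algorithm; this rewrites the affected extents.
btrfs filesystem defragment -r -czstd /mnt/data
```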
ZFS also lets you specify the zstd level with zfs set compression=zstd-$LEVEL …. btrfs avoids the issues by not having either inline compression or a cache device. The issues are also the result of other design decisions interacting with those, such as checksums being computed over the compressed data, and the ability to cache uncompressed versions in memory while storing compressed versions in L2ARC that fit in the same space as the original compressed versions on disk.
If we somehow stored the uncompressed checksums and reworked L2ARC to fall back to storing uncompressed data when it would otherwise have an issue, we would not have a problem, but storing uncompressed checksums does not play nicely with encryption. Simply storing checksums in plaintext is an information leak. Encryption typically has a minimum block size, and anything beyond that adds overhead, so 256-bit checksums might need more than 256 bits of storage. Then there is always the risk that the encrypted checksum and the encrypted data could later be found to enable decryption through some correlation a researcher discovers. As far as I know that is not an area of active research, because no one stores checksums this way, but if we began doing it, cryptographers would likely start investigating whether it can be exploited. If it can, we would have a serious problem that cannot be fixed without rewriting data, which is painful in multiple ways.
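To make the compressed-checksum point above concrete, here is a small userland sketch. The dataset name and input file are made up, and whether two particular zstd releases actually emit different bytes for a given input depends on the releases involved, since zstd guarantees a stable format rather than byte-identical output across versions:

```sh
# Per-dataset zstd level in OpenZFS (zstd-1 through zstd-19); only newly
# written blocks pick up the new setting.
zfs set compression=zstd-3 tank/data

# The same input at the same level can compress to different bytes under
# different zstd releases, so a checksum of the compressed output may change:
zstd -3 -c somefile | sha256sum
# ...even though the decompressed data stays identical:
zstd -3 -c somefile | zstd -d | sha256sum
```

Nothing in that pipeline is ZFS-specific, but it is the same property that makes regenerating compressed blocks hazardous once the stored checksum was computed over an older compressor's output.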