Add ZstdCompressor
Alternative to #149
This implementation supports multithreaded compression and decompression, and also supports the checksum option.
ChunkCodecLibZstd is being added as a direct dependency instead of a package extension, because Zarr.jl already depends on zstd through blosc.
One thing to note is that ChunkCodecLibZstd needs Julia ~~1.11~~ 1.10, and the ChunkCodec API is still experimental. Any suggestions for improving the API would be helpful.
Pull Request Test Coverage Report for Build 14317383070
Details
- 5 of 13 (38.46%) changed or added relevant lines in 1 file are covered.
- No unchanged relevant lines lost coverage.
- Overall coverage decreased (-0.6%) to 85.461%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/Compressors/zstd.jl | 5 | 13 | 38.46% |

| Totals | |
|---|---|
| Change from base Build 13680422725: | -0.6% |
| Covered Lines: | 917 |
| Relevant Lines: | 1073 |
💛 - Coveralls
very much looking forward to both Zstd and the multithreading it brings. are there tests we could add to this PR to ensure it is thread safe?
#181 would add some basic round trip tests, which should cover all the code in this PR.
I'm not sure how to test if this is thread-safe, but in ChunkCodecLibZstd, there is no global state being mutated, and the underlying C library is supposed to be safe to use in multiple threads.
Since this requires Julia 1.11 anyways, could we make this into a package extension and an optional dependency instead of a hard dependency?
The main advantage for the merge strategy here is that we do not make Zarr.jl require Julia 1.11. I would at most be more comfortable making it require Julia 1.10.
I'm happy to accept a PR to ChunkCodecLibZstd.jl to support Julia 1.10. Currently, the only 1.11 feature I am using is the public keyword. But is there a need to install the latest version of an in-development package on an old version of Julia?
If the in-development package is "Zarr.jl", then yes. Julia 1.10 is the current long-term-support release, and I would expect upcoming releases of Zarr.jl to support Julia 1.10 for some time. Making "ChunkCodecLibZstd.jl" a mandatory dependency of Zarr.jl would prevent that. I am less concerned about support for Julia versions prior to Julia 1.10.
For "ChunkCodecLibZstd.jl", dependence on Julia 1.11 is less of an issue as long as it is only an optional dependency of Zarr.jl.
Compat.jl could be used to address the Julia version dependency. However, I still prefer codecs as optional dependencies when possible. If a convenience package, ZarrUniverse.jl for example, is needed that loads Zarr.jl and all optional dependencies, that would not be hard to accommodate.
I will send a pull request.
if it's just `public` then it's as simple as `@compat public foo, bar` instead of `public foo, bar`. unnecessarily restricting version compatibility is a p.i.t.a. please make this change!
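For readers unfamiliar with the pattern, here is a minimal sketch of what that one-line change amounts to. The module name `CodecDemo` and the functions `foo` and `bar` are placeholders, not names from ChunkCodecLibZstd. The Compat.jl form is shown as a comment (it assumes a Compat.jl release that provides `@compat public`), and the runnable part uses a dependency-free version check that should have roughly the same effect.

```julia
module CodecDemo  # hypothetical module name, for illustration only

foo() = "foo"
bar() = "bar"

# With Compat.jl as a dependency, the declaration would be:
#     using Compat
#     @compat public foo, bar
#
# Dependency-free variant: a bare `public foo, bar` statement does not
# parse on Julia 1.10, so only parse and evaluate it on versions that
# understand the keyword.
if VERSION >= v"1.11.0-DEV.469"
    eval(Meta.parse("public foo, bar"))
end

end # module
```

Either way the module loads on Julia 1.10, where the names are simply not marked public, and on 1.11 or newer they are marked public.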
PR for using Compat.jl for public: https://github.com/nhz2/ChunkCodecs.jl/pull/31
PR for making ChunkCodecLibZstd an optional dependency: https://github.com/JuliaIO/Zarr.jl/pull/183
I started to test the ZarrUniverse idea here: https://github.com/mkitti/Zarr.jl/tree/mkitti-zarr-universe/lib/ZarrUniverse
```julia
using Pkg
Pkg.add(url="https://github.com/mkitti/Zarr.jl", rev="mkitti-zarr-universe", subdir="lib/ZarrUniverse")
```
or
```
] add https://github.com/mkitti/Zarr.jl#mkitti-zarr-universe:lib/ZarrUniverse
```
I've updated the PR. It should work with Julia 1.10 now. Also, the new decode! function throws a DecodedSizeError if the decoded size is too small or large, which cleans up the error handling.
This should be thread safe because it creates a new context for each compression and decompression call.
I understand that @bjarthur has tested these changes under a multithreaded context. It would be great to see a test for this in the test suite.
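As a starting point, here is a rough sketch of what such a test could look like. It is not taken from this PR: `threaded_roundtrip` is a hypothetical helper, and `xor_codec` is a stand-in so the sketch runs without extra packages. The helper takes the encode and decode functions as arguments, so it does not depend on the still-experimental ChunkCodec API.

```julia
using Test

# Hypothetical helper: round-trip every chunk on its own task and
# return true only if each decoded chunk equals its input.
function threaded_roundtrip(enc, dec, chunks)
    tasks = map(chunks) do chunk
        Threads.@spawn dec(enc(chunk)) == chunk
    end
    return all(fetch, tasks)
end

# Stand-in codec (an assumption, not a real compressor):
# XOR with a constant byte is its own inverse.
xor_codec(bytes) = bytes .⊻ 0x5a

# Chunks of varying length so tasks finish at different times.
chunks = [rand(UInt8, 1 + 97 * i) for i in 1:64]

@testset "threaded round trip" begin
    @test threaded_roundtrip(xor_codec, xor_codec, chunks)
end
```

To exercise the real codec, the stand-in would be replaced by closures over the Zstd encode and decode calls, for example `x -> encode(ZstdEncodeOptions(), x)` and `x -> decode(ZstdCodec(), x)`, assuming those names match the current ChunkCodecLibZstd API. The test only runs tasks in parallel when Julia is started with more than one thread, e.g. `julia --threads=4`. A passing run does not prove thread safety, but it would catch crashes or corrupted output from shared state.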