`zlib` is sometimes used as a filter

Open asinghvi17 opened this issue 1 year ago • 2 comments

I recently had this issue while writing FSSpec.jl:

 Zarr.zopen(st #= ::FSStore <: Zarr.AbstractStore =#)
ERROR: KeyError: key "zlib" not found
Stacktrace:
  [1] getindex(h::Dict{String, Type{<:Zarr.Filter}}, key::String)
    @ Base ./dict.jl:498
  [2] (::Zarr.var"#60#61")(f::Dict{String, Any})
    @ Zarr ~/.julia/dev/kerchunk-project/Zarr/src/Filters/Filters.jl:75
  [3] iterate
    @ ./generator.jl:47 [inlined]
  [4] collect_to!(dest::Vector{Zarr.ShuffleFilter}, itr::Base.Generator{Vector{…}, Zarr.var"#60#61"}, offs::Int64, st::Int64)
    @ Base ./array.jl:892
  [5] collect_to_with_first!(dest::Vector{…}, v1::Zarr.ShuffleFilter, itr::Base.Generator{…}, st::Int64)
    @ Base ./array.jl:870
  [6] _collect(c::Vector{Any}, itr::Base.Generator{Vector{…}, Zarr.var"#60#61"}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base ./array.jl:864
  [7] collect_similar(cont::Vector{Any}, itr::Base.Generator{Vector{Any}, Zarr.var"#60#61"})
    @ Base ./array.jl:763
  [8] map(f::Function, A::Vector{Any})
    @ Base ./abstractarray.jl:3285
  [9] getfilters(d::Dict{String, Any})
    @ Zarr ~/.julia/dev/kerchunk-project/Zarr/src/Filters/Filters.jl:74

It looks like people are using zlib as a Zarr filter too, not only as a compressor.
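To make the failure mode concrete, here is a minimal sketch of a filter registry keyed by the `id` field of each filter's metadata entry, which is the shape the stack trace suggests (`Dict{String, Type{<:Zarr.Filter}}`). All names here (`AbstractFilter`, `FILTER_REGISTRY`, `build`, `lookup_filters`) are hypothetical stand-ins rather than Zarr.jl's actual internals, and the friendlier error is only one possible way to surface an unregistered id.

```julia
# Hypothetical stand-ins; these are NOT Zarr.jl's real types or functions.
abstract type AbstractFilter end

struct ShuffleFilter <: AbstractFilter
    elementsize::Int
end

# Registry keyed by the "id" field found in each filter's JSON metadata.
const FILTER_REGISTRY = Dict{String, Type{<:AbstractFilter}}(
    "shuffle" => ShuffleFilter,
)

# Build a filter instance from one metadata entry.
build(::Type{ShuffleFilter}, meta) = ShuffleFilter(meta["elementsize"])

# Look up every filter named in the metadata. A bare FILTER_REGISTRY[id]
# would throw KeyError for an unregistered id such as "zlib"; this sketch
# reports the unsupported id and the known ids instead.
function lookup_filters(metas)
    map(metas) do meta
        id = meta["id"]
        if !haskey(FILTER_REGISTRY, id)
            known = join(sort!(collect(keys(FILTER_REGISTRY))), ", ")
            throw(ArgumentError("unsupported filter id \"$id\"; known ids: $known"))
        end
        build(FILTER_REGISTRY[id], meta)
    end
end
```

With only `"shuffle"` registered, metadata that lists `{"id": "zlib", ...}` among its filters cannot be resolved, regardless of whether a zlib compressor exists elsewhere in the package.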

Is it time to factor the compressors and filters out into a single API and make a NumericCodecs.jl?

asinghvi17, Aug 28 '24 03:08

Good question. Starting with Zarr v3, there will no longer be any difference between compressors and filters; there will only be a codec pipeline. I have no idea how well this can already be carried over to the v2 model, but maybe it would be easier for now to add a zlib Filter type that shares its functionality with the compressor?
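One way to read this suggestion is a thin filter type that delegates to an existing compressor, so the byte-level logic lives in one place. The sketch below is an assumption-laden illustration: `Compressor`, `Filter`, `compress`, `uncompress`, `encode`, `decode`, and `CompressorFilter` are hypothetical names, and `ToyRLE` is a toy run-length encoder standing in for zlib so that the example needs no packages.

```julia
# Hypothetical names throughout; not Zarr.jl's actual API.
abstract type Compressor end
abstract type Filter end

# Toy run-length encoder standing in for a real zlib compressor.
struct ToyRLE <: Compressor end

function compress(::ToyRLE, data::Vector{UInt8})
    out = UInt8[]
    i = 1
    while i <= length(data)
        n = 1  # length of the current run, capped at 255 to fit in a byte
        while i + n <= length(data) && data[i + n] == data[i] && n < 255
            n += 1
        end
        push!(out, UInt8(n), data[i])  # emit (count, value) pairs
        i += n
    end
    return out
end

function uncompress(::ToyRLE, data::Vector{UInt8})
    out = UInt8[]
    for k in 1:2:length(data)
        append!(out, fill(data[k + 1], Int(data[k])))
    end
    return out
end

# A filter that reuses a compressor instead of duplicating its logic.
struct CompressorFilter{C<:Compressor} <: Filter
    compressor::C
end

encode(f::CompressorFilter, data::Vector{UInt8}) = compress(f.compressor, data)
decode(f::CompressorFilter, data::Vector{UInt8}) = uncompress(f.compressor, data)
```

Under this design, registering a zlib filter would only require wrapping the existing zlib compressor, at the cost of keeping the filter/compressor split in place.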

meggart, Aug 29 '24 11:08

Since I'm adding more filters anyway, I'm wondering whether it makes sense to refactor everything into codecs now. At least for zcompress/zencode and zuncompress/zdecode, we could theoretically just merge *compress into *encode. That would probably also make things easier for Zarr v3.

I can implement your suggestion as a monkeypatch, though; it's just that unifying might make more sense if Zarr v3 is coming up anyway.
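As a rough sketch of what merging the two interfaces could look like, the following defines a single `Codec` supertype with one `encode`/`decode` pair and applies a list of codecs in order when encoding and in reverse order when decoding. The type and function names (`Codec`, `AddOffset`, `ReverseBytes`, `encode_all`, `decode_all`) are hypothetical, and both codecs are toys chosen only to keep the example dependency-free.

```julia
# Hypothetical unified interface; not Zarr.jl's actual API.
abstract type Codec end

# Toy filter-like codec: adds a constant to every byte (wraps modulo 256).
struct AddOffset <: Codec
    offset::UInt8
end
encode(c::AddOffset, data::Vector{UInt8}) = data .+ c.offset
decode(c::AddOffset, data::Vector{UInt8}) = data .- c.offset

# Toy stand-in for a compressor: reverses the byte order.
struct ReverseBytes <: Codec end
encode(::ReverseBytes, data::Vector{UInt8}) = reverse(data)
decode(::ReverseBytes, data::Vector{UInt8}) = reverse(data)

# Encoding applies the codecs in order; decoding undoes them in reverse.
encode_all(codecs, data) =
    foldl((acc, c) -> encode(c, acc), codecs; init = data)
decode_all(codecs, data) =
    foldl((acc, c) -> decode(c, acc), Iterators.reverse(codecs); init = data)

codecs = Codec[AddOffset(0x03), ReverseBytes()]
data = UInt8[0xff, 0x01, 0x02]
roundtrip = decode_all(codecs, encode_all(codecs, data))
```

Here the v2 layout (a list of filters followed by one compressor) becomes a special case: the compressor is simply the last codec in the list.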

asinghvi17, Aug 29 '24 17:08