Zarr.jl

Zarr from gdalwarp cannot be read because fill_value is Nothing

Open felixcremer opened this issue 10 months ago • 4 comments

I converted a tif file using gdalwarp -of Zarr path.tif path.zarr. This produces a .zarr file that I can open with Zarr.jl, but reading the data fails with the error below. The failure happens while the output array is being filled with the fill value, which is Nothing. The tif file and the converted Zarr file are attached to the issue. I am not sure whether this is a Zarr.jl bug or a GDAL bug.

julia> z = zopen(path)
ZarrGroup at DirectoryStore("test/data/cea.zarr") and path 
Variables: Y X cea 

julia> z.arrays
Dict{String, ZArray} with 3 entries:
  "Y"   => ZArray{Float64} of size 515
  "X"   => ZArray{Float64} of size 514
  "cea" => ZArray{UInt8} of size 514 x 515

julia> using ZarrDatasets^C

julia> z["cea"][:,:]
ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type UInt8

Closest candidates are:
  convert(::Type{T}, ::EzXML.NodeType) where T<:Integer
   @ EzXML ~/.julia/packages/EzXML/DL8na/src/node.jl:36
  convert(::Type{T}, ::EzXML.ReaderType) where T<:Integer
   @ EzXML ~/.julia/packages/EzXML/DL8na/src/streamreader.jl:59
  convert(::Type{T}, ::Number) where T<:Number
   @ Base number.jl:7
  ...

Stacktrace:
 [1] fill!(dest::Matrix{UInt8}, x::Nothing)
   @ Base ./array.jl:393
 [2] uncompress_raw!
   @ ~/.julia/packages/Zarr/uTqS1/src/ZArray.jl:259 [inlined]
 [3] uncompress_to_output!(aout::Matrix{…}, output_base_offsets::Tuple{…}, z::ZArray{…}, chunk_compressed::Nothing, current_chunk_offsets::Tuple{…}, a::Matrix{…}, indranges::Tuple{…})
   @ Zarr ~/.julia/packages/Zarr/uTqS1/src/ZArray.jl:270
 [4] readblock!(aout::Matrix{…}, z::ZArray{…}, r::CartesianIndices{…})
   @ Zarr ~/.julia/packages/Zarr/uTqS1/src/ZArray.jl:178
 [5] readblock!(::ZArray{…}, ::Matrix{…}, ::Base.OneTo{…}, ::Vararg{…})
   @ Zarr ~/.julia/packages/Zarr/uTqS1/src/ZArray.jl:247
 [6] getindex_disk(::ZArray{UInt8, 2, Zarr.NoCompressor, DirectoryStore}, ::Function, ::Vararg{Function})
   @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:40
 [7] getindex(::ZArray{UInt8, 2, Zarr.NoCompressor, DirectoryStore}, ::Function, ::Function)
   @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:211
 [8] top-level scope
   @ REPL[24]:1
Some type information was truncated. Use `show(err)` to see complete types.

felixcremer Apr 25 '24 15:04

cea.zip contains the tif and Zarr files that are failing.

felixcremer Apr 25 '24 15:04

This error occurs when chunks are missing and no fill value is defined, which means that the data for those chunks is simply undefined. I know that zarr-python does not error when you read such a slice, but what you get is random data, usually whatever was in the buffer before. For Julia I would still strongly recommend throwing an error in this case, but I agree that the error message can be improved.

I would say this is a GDAL bug, because when no fill value is defined, all chunks need to exist for a Zarr array to be valid.
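
To check whether a given store is in this state, a rough sketch along the following lines might help. It assumes a Zarr v2 array stored in a plain directory with at least one dimension, reads the .zarray metadata with JSON.jl, and lists the chunk keys that have no file on disk. The helper name missing_chunks is made up for illustration and is not part of Zarr.jl.

```julia
using JSON  # JSON.jl, assumed to be installed

# Hypothetical helper (not a Zarr.jl function): return the fill value and the
# chunk keys that a Zarr v2 array directory expects but that are not on disk.
function missing_chunks(arraypath::AbstractString)
    meta = JSON.parsefile(joinpath(arraypath, ".zarray"))
    shape = Int.(meta["shape"])
    chunks = Int.(meta["chunks"])
    # Zarr v2 separates chunk indices with "." unless the metadata says otherwise
    sep = get(meta, "dimension_separator", ".")
    # Number of chunks along each dimension, counting partial edge chunks
    nchunks = cld.(shape, chunks)
    absent = String[]
    for I in CartesianIndices(Tuple(nchunks))
        # Chunk keys are zero-based and follow the dimension order of the metadata
        key = join(Tuple(I) .- 1, sep)
        # With a "/" separator the key maps to nested directories
        isfile(joinpath(arraypath, split(key, "/")...)) || push!(absent, key)
    end
    return get(meta, "fill_value", nothing), absent
end
```

For the store in this issue, the array directory is presumably test/data/cea.zarr/cea. If the first returned value is nothing and the second is non-empty, the array has missing chunks and no fill value to stand in for them.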

meggart Apr 26 '24 07:04

Yes, reading random data sounds bad. Could we enable setting the fill_value from inside Julia, or might that also be problematic?

I will open a PR for the error message.
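
Until something like that exists, one possible workaround is to edit the metadata by hand. The sketch below assumes a Zarr v2 array in a plain directory and rewrites the fill_value entry of its .zarray file using JSON.jl. The helper name set_fill_value! is hypothetical and not part of Zarr.jl, and whether a value such as 0 is the right stand-in for the missing chunks is an assumption that depends on the data.

```julia
using JSON  # JSON.jl, assumed to be installed

# Hypothetical helper (not a Zarr.jl function): set the fill value in the
# .zarray metadata of a Zarr v2 array directory, but only if none is defined.
function set_fill_value!(arraypath::AbstractString, value)
    metafile = joinpath(arraypath, ".zarray")
    meta = JSON.parsefile(metafile)
    # Refuse to overwrite an existing fill value
    get(meta, "fill_value", nothing) === nothing ||
        error("fill_value is already defined in $metafile")
    meta["fill_value"] = value
    # Write the modified metadata back to disk
    open(metafile, "w") do io
        write(io, JSON.json(meta))
    end
    return meta
end
```

This changes the file on disk, so it is worth working on a copy. If the store also carries consolidated metadata in a .zmetadata file, that copy is not touched by this sketch and would need the same change.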

felixcremer Apr 26 '24 07:04

While working on this, I realized that we can also create a broken Zarr file ourselves, one with missing chunks and no fill_value. Should we at least emit a warning when we create a file with this setup?

felixcremer Apr 26 '24 08:04