YAXArrays.jl
YAXArrays.jl copied to clipboard
Map of cubes with different chunkings can't be indexed by axis
When I apply a map over two cubes which have different internal chunkings which I make align by the setchunks function, I can do the computation but the subsetting into one of the axes fails with a ERROR: Chunks do not align in dimension 2
But the indexing into the underlying data works and it seems that the computation is correct.
I can't reproduce it yet with a toy example.
julia> wcorann_nan = map((x,y) -> isnan(x) ? NaN32 : y, setchunks(wcor, YAXArrays.Cubes.cubechunks(wcorann)), wcorann)
YAXArray with the following dimensions
Y Axis with 5820 Elements from 561609.15 to 736179.15
X Axis with 7500 Elements from 9.48724066e6 to 9.26227066e6
Polarisation Axis with 2 elements: VH VV
name: layer
Total size: 333.02 MB
julia> wcorann_nan[Y=(560000,570000)]
ERROR: Chunks do not align in dimension 2
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] (::DiskArrays.var"#73#77"{Int64, Int64})(i::Int64, ch::DiskArrays.GridChunks{3})
@ DiskArrays ~/.julia/packages/DiskArrays/bcti2/src/broadcast.jl:96
[3] #4
@ ./generator.jl:36 [inlined]
[4] iterate
@ ./generator.jl:47 [inlined]
[5] collect_to!
@ ./array.jl:845 [inlined]
[6] collect_to_with_first!
@ ./array.jl:823 [inlined]
[7] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{Vector{Int64}, Vector{DiskArrays.GridChunks{3}}}}, Base.var"#4#5"{DiskArrays.var"#73#77"{Int64, Int64}}})
@ Base ./array.jl:797
[8] map
@ ./abstractarray.jl:3055 [inlined]
[9] merge_chunks(csnow::Vector{DiskArrays.GridChunks{3}}, n::Int64)
@ DiskArrays ~/.julia/packages/DiskArrays/bcti2/src/broadcast.jl:94
[10] #63
@ ~/.julia/packages/DiskArrays/bcti2/src/broadcast.jl:72 [inlined]
[11] ntuple
@ ./ntuple.jl:19 [inlined]
[12] common_chunks(::Tuple{Int64, Int64, Int64}, ::Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}, ::Vararg{Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}})
@ DiskArrays ~/.julia/packages/DiskArrays/bcti2/src/broadcast.jl:63
[13] eachchunk
@ ~/.julia/packages/DiskArrays/bcti2/src/broadcast.jl:51 [inlined]
[14] eachchunk_view(#unused#::DiskArrays.Chunked, vv::SubArray{Float32, 3, DiskArrays.BroadcastDiskArray{Float32, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}, var"#52#53", Tuple{Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}, Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}}}}, Tuple{UnitRange{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}, false})
@ DiskArrays ~/.julia/packages/DiskArrays/bcti2/src/subarray.jl:36
[15] eachchunk(a::DiskArrays.SubDiskArray{Float32, 3})
@ DiskArrays ~/.julia/packages/DiskArrays/bcti2/src/subarray.jl:32
[16] subsetcube(z::YAXArray{Float32, 3, DiskArrays.BroadcastDiskArray{Float32, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}, var"#52#53", Tuple{Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}, Zarr.ZArray{Float32, 3, Zarr.BloscCompressor, Zarr.DirectoryStore}}}}, Vector{CubeAxis}}; kwargs::Base.Pairs{Symbol, Tuple{Int64, Int64}, Tuple{Symbol}, NamedTuple{(:Y,), Tuple{Tuple{Int64, Int64}}}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/rQDCf/src/Cubes/Cubes.jl:260
[17] #getindex#22
@ ~/.julia/packages/YAXArrays/rQDCf/src/Cubes/Cubes.jl:316 [inlined]
[18] top-level scope
@ REPL[233]:1
Hereby the cubes are the following:
julia> wcor
YAXArray with the following dimensions
Y Axis with 5820 Elements from 561609.15 to 736179.15
X Axis with 7500 Elements from 9.48724066e6 to 9.26227066e6
Polarisation Axis with 2 elements: VH VV
name: layer
Total size: 333.02 MB
julia> wcorann
YAXArray with the following dimensions
Y Axis with 5820 Elements from 561609.15 to 736179.15
X Axis with 7500 Elements from 9.48724066e6 to 9.26227066e6
Polarisation Axis with 2 elements: VH VV
name: layer
Total size: 333.02 MB
This is not that easy to fix, because it would need a setup of a wrapper which would know the current chunk size and the targeted chunk size. For now we can use a mapCube with empty input and output dimensions when we do not depend on the laziness of the map call.
# This will be broken:
mapped = map((x,y) -> x * y, a1, setchunks(a2, YAXArrays.Cubes.cubechunks(a1)))
# This works but is not lazy:
mapcubed = mapCube((x,y) -> x[] * y[], (a1, setchunks(a2, YAXArrays.Cubes.cubechunks(a1))); indims=(InDims(),InDims()), outdims=OutDims(), inplace=false)