YAXArrays.jl
Error in inferring input dimensions during mapCube of a Dataset
I want to call mapCube on all variables of a Dataset within the same Zarr store at once, e.g. to convert the bands red, green, and blue in parallel. It is possible to apply mapCube to each array separately. However, the arrays share some input and output dimensions, so I would like to keep them in the same Zarr Dataset store and write the results directly via outdims, skipping the additional copy made by savedataset.
Unfortunately, the following minimal example
using YAXArrays
using DimensionalData
a = rand(X(1:10), Y(1:5)) |> x -> YAXArray(x.data)
b = rand(X(1:10), Y(1:5)) |> x -> YAXArray(x.data)
ds = Dataset(a=a, b=b)
res = mapCube(
    ds;
    indims=(InDims(), InDims()),
    outdims=OutDims(Ti(1:10); path=tempname(), backend=:zarr),
) do xin, xout
    xout .= 42
end
results in an error:
ERROR: type Tuple has no field axisdesc
Stacktrace:
[1] getproperty
@ ./Base.jl:49 [inlined]
[2] mapCube(::Function, ::Dataset; indims::Tuple{…}, outdims::OutDims, inplace::Bool, kwargs::@Kwargs{})
@ YAXArrays.DAT ~/prj/YAXArrays.jl/src/DAT/DAT.jl:339
[3] top-level scope
@ REPL[12]:1
Notably, we get the same error after converting the Dataset into a tuple of YAXArrays:
using YAXArrays, Zarr
using YAXArrays: YAXArrays as YAX
using Dates
f(lo, la, t) = (lo + la + Dates.dayofyear(t))
function g(xout, lo, la, t)
    xout .= f.(lo, la, t)
end
lat_yax = YAXArray(lat(range(1, 10)))
lon_yax = YAXArray(lon(range(1, 15)))
tspan = Date("2022-01-01"):Day(1):Date("2022-01-30")
time_yax = YAXArray(YAX.time(tspan))
gen_cube = mapCube(g, (lon_yax, lat_yax, time_yax);
    indims = (InDims(), InDims(), InDims("time")),
    outdims = OutDims("time", overwrite=true, path="my_gen_cube.zarr", backend=:zarr,
        outtype = Float32)
)
ds_t = Dataset(; r = lat_yax, g = lon_yax, t = time_yax)
gen_cube_ds = mapCube(g, ds_t;
    indims = (InDims(), InDims(), InDims("time")),
    outdims = OutDims("time", overwrite=true, path="my_gen_cube.zarr", backend=:zarr,
        outtype = Float32)
)
The corresponding method, mapCube(::Function, ::Dataset) (src/DAT/DAT.jl:339 in the stack trace above), does not have unit tests.
Workaround
Create and save a skeleton of the dataset first, then fill it later in parallel via indexed assignment (setindex!). See the YAXArrays and xarray documentation.
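A minimal sketch of this workaround, assuming the skeleton approach described in the YAXArrays user guide (savedataset with skeleton=true, then reopening the store with Zarr's zopen in write mode). The variable names a and b, the temporary path, and the constant fill value are only illustrative, and the loop below writes sequentially; in a real setup each worker would presumably write its own non-overlapping chunk range.

```julia
using YAXArrays, Zarr
using DimensionalData

# Illustrative axes and store location (assumptions, not from the report above)
axlist = (X(1:10), Y(1:5))
path = tempname() * ".zarr"

# Empty arrays that only define shape, element type, and axes
empty_layer() = YAXArray(axlist, zeros(Union{Missing,Float32}, 10, 5))
ds = Dataset(; a = empty_layer(), b = empty_layer())

# Write only axes and metadata to disk; no array data is copied
savedataset(ds; path = path, driver = :zarr, skeleton = true, overwrite = true)

# Reopen the store with write access and fill each variable by indexed assignment
store = zopen(path, "w")
for name in ("a", "b")
    store[name][:, :] = fill(Float32(42), 10, 5)
end
```

Both variables then live in the same Zarr store, which is what the mapCube call on the Dataset was meant to achieve, at the cost of handling the chunk-wise iteration manually.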