YAXArrays.jl
YAXArrays.jl copied to clipboard
subsetting cube by Lon, lat and other variables fails
The following fails at several levels:
If I do the following
axlist = [
RangeAxis("other", 1:4),
RangeAxis("time", range(1, 20, length=20)),
RangeAxis("lon", range(-66.4598,-66.4598, length=1)),
RangeAxis("lat", range(-33.4648, -33.4648, length=1)),
]
data = rand(4, 20, 1, 1)
ds = YAXArray(axlist, data)
note that
ds.lat
1-element RangeAxis{Float64, :lat, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}:
-33.4648
and doing the following will fail:
subsetcube(ds, lon = -66.4598)
ERROR: InexactError: trunc(Int64, NaN)
Stacktrace:
[1] trunc
@ ./float.jl:805 [inlined]
[2] round
@ ./float.jl:369 [inlined]
[3] #axVal2Index#1
@ ~/.julia/packages/YAXArrays/swXW2/src/Cubes/Axes.jl:203 [inlined]
[4] interpretsubset(subexpr::Float64, ax::RangeAxis{Float64, :lon, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/swXW2/src/Cubes/Cubes.jl:269
[5] (::YAXArrays.Cubes.var"#14#18"{YAXArray{Float64, 4, Array{Float64, 4}, Vector{RangeAxis}}, Vector{Any}})(kw::Pair{Any, Any})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/swXW2/src/Cubes/Cubes.jl:297
[6] foreach(f::YAXArrays.Cubes.var"#14#18"{YAXArray{Float64, 4, Array{Float64, 4}, Vector{RangeAxis}}, Vector{Any}}, itr::Dict{Any, Any})
@ Base ./abstractarray.jl:2694
[7] _subsetcube(z::YAXArray{Float64, 4, Array{Float64, 4}, Vector{RangeAxis}}, subs::Vector{Any}; kwargs::Base.Pairs{Symbol, Float64, Tuple{Symbol}, NamedTuple{(:lon,), Tuple{Float64}}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/swXW2/src/Cubes/Cubes.jl:289
[8] subsetcube(z::YAXArray{Float64, 4, Array{Float64, 4}, Vector{RangeAxis}}; kwargs::Base.Pairs{Symbol, Float64, Tuple{Symbol}, NamedTuple{(:lon,), Tuple{Float64}}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/swXW2/src/Cubes/Cubes.jl:258
[9] top-level scope
the other issue is that doing
subsetcube(ds, other = (1,2))
YAXArray with the following dimensions
other Axis with 1 Elements from 1 to 1
time Axis with 20 Elements from 1.0 to 20.0
lon Axis with 1 Elements from -66.4598 to -66.4598
lat Axis with 1 Elements from -33.4648 to -33.4648
Total size: 160.0 bytes
will subset only the first element, but doing
subsetcube(ds, other = (1,3))
YAXArray with the following dimensions
other Axis with 3 Elements from 1 to 3
time Axis with 20 Elements from 1.0 to 20.0
lon Axis with 1 Elements from -66.4598 to -66.4598
lat Axis with 1 Elements from -33.4648 to -33.4648
Total size: 480.0 bytes
will subset 3 elements, and I cannot do the first 2 elements.
and doing
subsetcube(ds, other = (1,4))
will still give back still only 3 elements.
YAXArray with the following dimensions
other Axis with 3 Elements from 1 to 3
time Axis with 20 Elements from 1.0 to 20.0
lon Axis with 1 Elements from -66.4598 to -66.4598
lat Axis with 1 Elements from -33.4648 to -33.4648
Total size: 480.0 bytes
As for the first issue, of course doing:
axlist = [
RangeAxis("other", 1:4),
RangeAxis("time", range(1, 20, length=20)),
RangeAxis("lon", [-66.4598]),
RangeAxis("lat", [-33.4648]),
]
data = rand(4, 20, 1, 1)
ds = YAXArray(axlist, data)
ds.lon
subsetcube(ds, lon = -66.4598, lat = -33.4648)
works, however sometimes we don't have control over the YAX generation and having the type StepRangeLen will be unavoidable.
YAXArray with the following dimensions
other Axis with 4 Elements from 1 to 4
time Axis with 20 Elements from 1.0 to 20.0
Total size: 640.0 bytes
This is the error message for subsetting a Timeaxis:
ERROR: MethodError: no method matching convert_time(::Type{DateTime}, ::Vector{DateTime})
Closest candidates are:
convert_time(::Type{<:TimeType}, ::Date) at ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Axes.jl:281
convert_time(::Type{<:TimeType}, ::TimeType) at ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Axes.jl:279
Stacktrace:
[1] axVal2Index(a::RangeAxis{DateTime, :time, Vector{DateTime}}, v::Vector{DateTime}; fuzzy::Bool)
@ YAXArrays.Cubes.Axes ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Axes.jl:197
[2] interpretsubset(subexpr::Vector{DateTime}, ax::RangeAxis{DateTime, :time, Vector{DateTime}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Cubes.jl:234
[3] (::YAXArrays.Cubes.var"#14#18"{YAXArray{Float32, 4, DiskArrayTools.DiskArrayStack{Float32, 4, DiskArrays.SubDiskArray{Float32, 3}, 1}, Vector{CubeAxis}}, Vector{Any}})(kw::Pair{Any, Any})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Cubes.jl:262
[4] foreach(f::YAXArrays.Cubes.var"#14#18"{YAXArray{Float32, 4, DiskArrayTools.DiskArrayStack{Float32, 4, DiskArrays.SubDiskArray{Float32, 3}, 1}, Vector{CubeAxis}}, Vector{Any}}, itr::Dict{Any, Any})
@ Base ./abstractarray.jl:2694
[5] _subsetcube(z::YAXArray{Float32, 4, DiskArrayTools.DiskArrayStack{Float32, 4, DiskArrays.SubDiskArray{Float32, 3}, 1}, Vector{CubeAxis}}, subs::Vector{Any}; kwargs::Base.Pairs{Symbol, Vector{DateTime}, Tuple{Symbol}, NamedTuple{(:time,), Tuple{Vector{DateTime}}}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Cubes.jl:254
[6] subsetcube(z::YAXArray{Float32, 4, DiskArrayTools.DiskArrayStack{Float32, 4, DiskArrays.SubDiskArray{Float32, 3}, 1}, Vector{CubeAxis}}; kwargs::Base.Pairs{Symbol, Vector{DateTime}, Tuple{Symbol}, NamedTuple{(:time,), Tuple{Vector{DateTime}}}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Cubes.jl:223
[7] #getindex#22
@ ~/.julia/packages/YAXArrays/Ly8sc/src/Cubes/Cubes.jl:281 [inlined]
[8] top-level scope
@ REPL[154]:1
As discussed, using DimensionalData.jl solves the issue of indexing.
using DimensionalData
axlist = [
RangeAxis("other", 1:4),
RangeAxis("time", range(1, 20, length=20)),
RangeAxis("lon", range(-66.4598,-66.4598, length=1)),
RangeAxis("lat", range(-33.4648, -33.4648, length=1)),
]
data = rand(4, 20, 1, 1)
ds = YAXArray(axlist, data)
dim = yaxconvert(DimArray, ds)
subset = dim[lon=At(-66.4598), lat = At(-33.4648)]
yax = yaxconvert(YAXArray, subset)
YAXArray with the following dimensions
other Axis with 4 Elements from 1 to 4
time Axis with 20 Elements from 1.0 to 20.0
Total size: 640.0 bytes
Thanks a lot @lazarusA. Your solution works like a charm!
is also in the docs hopefully soon will be part of the package, so we don't need to do those extra steps.
Ok, I just found out that subset = dim[lon=At(-66.4598), lat = At(-33.4648)] is not lazy. This is a problem when working with a huge dataset as the Earth System Data Cube.