YAXArrays.jl
YAXArrays slower than DimensionalData
Indexing is slower for the YAXArray when converting yax to dd:
julia> using YAXArrays, YAXArrayBase, DimensionalData, BenchmarkTools
julia> yax = YAXArray(rand(10, 20, 5));
julia> dd = yaxconvert(DimArray, yax);
julia> @benchmark yax[Dim_1=1:3]
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
Range (min … max): 4.059 μs … 190.583 μs ┊ GC (min … max): 0.00% … 95.98%
Time (median): 4.137 μs ┊ GC (median): 0.00%
Time (mean ± σ): 4.303 μs ± 4.106 μs ┊ GC (mean ± σ): 2.28% ± 2.34%
▂▇██▇▅▃▁ ▁ ▂
█████████▆▆▄▄▅▃▃▂▃▂▂▅▇████████▇▇▆▅▅▅▄▅▄▅▄▃▄▆▅▆▆▆▆▇▆▄▄▅▅▄▄▅▆ █
4.06 μs Histogram: log(frequency) by time 5.5 μs <
Memory estimate: 4.88 KiB, allocs estimate: 87.
julia> @benchmark dd[Dim_1=1:3]
BenchmarkTools.Trial: 10000 samples with 313 evaluations.
Range (min … max): 269.834 ns … 8.516 μs ┊ GC (min … max): 0.00% … 96.06%
Time (median): 369.808 ns ┊ GC (median): 0.00%
Time (mean ± σ): 489.908 ns ± 878.050 ns ┊ GC (mean ± σ): 24.04% ± 12.50%
█▇ ▁
███▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆██ █
270 ns Histogram: log(frequency) by time 6.9 μs <
Memory estimate: 2.59 KiB, allocs estimate: 2.
The same holds when converting dd to yax:
julia> DD = DimArray(rand(50, 31), (X(), Y(10.0:40.0)), metadata = Dict{String, Any}());
julia> YAX = yaxconvert(YAXArray, DD)
50×31 YAXArray{Float64,2} with dimensions:
Dim{:X},
Dim{:Y} Sampled{Float64} 10.0:1.0:40.0 ForwardOrdered Regular Points
Total size: 12.11 KB
julia> @benchmark DD[Y(1:10), X(1)]
BenchmarkTools.Trial: 10000 samples with 991 evaluations.
Range (min … max): 41.751 ns … 407.417 ns ┊ GC (min … max): 0.00% … 85.99%
Time (median): 42.592 ns ┊ GC (median): 0.00%
Time (mean ± σ): 44.705 ns ± 16.278 ns ┊ GC (mean ± σ): 2.39% ± 5.63%
▃▆█▇▄▂ ▂▂▂▂▁ ▁
███████▆▅▄▄▃▄███████▆▆▆▆▅▆▆▆▇▇▇█▇▇█▇▇▆▅▆▆▅▆▆▅▆▆▆▅▅▅▄▅▄▄▄▄▄▅▅ █
41.8 ns Histogram: log(frequency) by time 59.5 ns <
Memory estimate: 240 bytes, allocs estimate: 2.
julia> @benchmark YAX[Dim{:Y}(1:10), Dim{:X}(1)]
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
Range (min … max): 2.366 μs … 187.162 μs ┊ GC (min … max): 0.00% … 97.36%
Time (median): 2.431 μs ┊ GC (median): 0.00%
Time (mean ± σ): 2.572 μs ± 4.234 μs ┊ GC (mean ± σ): 3.97% ± 2.38%
▁▆██▇▆▄▂▂ ▁ ▁▁ ▂
██████████▆▆▄▅▁▃▁▃▃▁▁▁▄▅▇▇██████▇▇▇▆▄▆▆▄▃▃▄▁▆▅▄▅▄▃▅▅▆▆▆▆▆▄▅ █
2.37 μs Histogram: log(frequency) by time 3.42 μs <
Memory estimate: 2.92 KiB, allocs estimate: 39.
That's a 10-fold or larger difference (about 11× and 57× by median time) for the arrays above, which are small and in memory. But even for a 450 MB on-disk Zarr array, yax is still about 20% slower than dd:
julia> using Zarr
julia> yax = Cube("foo.zarr");
julia> dd = yaxconvert(DimArray, yax);
julia> @benchmark collect(yax[Dim{:LI}(At("bar"))])
BenchmarkTools.Trial: 73 samples with 1 evaluation.
Range (min … max): 52.840 ms … 124.095 ms ┊ GC (min … max): 3.18% … 58.09%
Time (median): 54.923 ms ┊ GC (median): 5.83%
Time (mean ± σ): 68.719 ms ± 25.640 ms ┊ GC (mean ± σ): 24.69% ± 20.59%
▂█
██▇▄▁▁▅▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▃▃▃▃▁▄▃▃ ▁
52.8 ms Histogram: frequency by time 121 ms <
Memory estimate: 126.95 MiB, allocs estimate: 10584.
julia> @benchmark collect(dd[Dim{:LI}(At("bar"))])
BenchmarkTools.Trial: 110 samples with 1 evaluation.
Range (min … max): 44.108 ms … 107.490 ms ┊ GC (min … max): 0.00% … 58.55%
Time (median): 45.175 ms ┊ GC (median): 1.25%
Time (mean ± σ): 45.998 ms ± 6.025 ms ┊ GC (mean ± σ): 2.60% ± 5.66%
▄█
▅▄▆▃▆▄█████▃▃▁▃▃▄▄▃▃▃▁▁▁▃▁▁▁▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
44.1 ms Histogram: frequency by time 51.9 ms <
Memory estimate: 38.41 MiB, allocs estimate: 2969.
julia> size(yax)
(20222, 1098, 145)
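For reference, the slowdown factors can be recomputed from the median times reported in the benchmark output above. This is only a small arithmetic sketch: the case labels (`yax_to_dd`, `dd_to_yax`, `zarr`) are names made up here for illustration, and the numbers are copied by hand from the output rather than measured again.

```julia
# Median times copied from the benchmark output above, converted to nanoseconds.
# The field names are illustrative labels, not part of any package API.
medians_ns = (
    yax_to_dd = (yax = 4.137e3,  dd = 369.808),   # yax[Dim_1=1:3] vs dd[Dim_1=1:3]
    dd_to_yax = (yax = 2.431e3,  dd = 42.592),    # YAX[...] vs DD[Y(1:10), X(1)]
    zarr      = (yax = 54.923e6, dd = 45.175e6),  # collect(...) on the on-disk array
)

# Slowdown of YAXArray indexing relative to DimArray indexing, per case.
slowdown = map(m -> m.yax / m.dd, medians_ns)

for (name, ratio) in pairs(slowdown)
    println(rpad(name, 10), " ", round(ratio; digits = 2), "×")
end
```

By median time this gives roughly 11× and 57× for the two in-memory cases and roughly 1.2× for the Zarr case.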
Is this expected?
julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 12 × Apple M2 Max
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 on 8 virtual cores
Environment:
JULIA_PROJECT = @.
JULIA_EDITOR = vi
Package versions: DimensionalData v0.25.8 and YAXArrays v0.5.2