AxisKeys.jl

Tables.jl `wrapdims` doesn't work on `StructArray`

Open jariji opened this issue 6 months ago • 3 comments

The Tables.jl method of `wrapdims` works on a `DataFrame` but not on a `StructArray`.

using AxisKeys, DataFrames, StructArrays

julia> wrapdims(DataFrame(a=[1,1,2,2], b=[:x, :y, :x, :y], c=[10, 20, 30, 40]), :c, :a, :b)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   a ∈ 2-element Vector{Int64}
→   b ∈ 2-element Vector{Symbol}
And data, 2×2 Matrix{Int64}:
      (:x)  (:y)
 (1)  10    20
 (2)  30    40

julia> wrapdims(StructArray(a=[1,1,2,2], b=[:x, :y, :x, :y], c=[10, 20, 30, 40]), :c, :a, :b)
ERROR: ArgumentError: wrong number of names, got (:c, :a, :b) with ndims(A) == 1
Stacktrace:
 [1] check_names(A::StructVector{@NamedTuple{…}, @NamedTuple{…}, Int64}, names::Tuple{Symbol, Symbol, Symbol})
   @ AxisKeys ~/.julia/packages/AxisKeys/sYP4R/src/wrap.jl:103
 [2] wrapdims(::StructVector{@NamedTuple{…}, @NamedTuple{…}, Int64}, ::Symbol, ::Symbol, ::Symbol)
   @ AxisKeys ~/.julia/packages/AxisKeys/sYP4R/src/wrap.jl:85
 [3] top-level scope

jariji, Jun 26 '25 06:06

It's the "array vs table" confusion yet again. Tables.jl tables can be of arbitrary types, and it's impossible to distinguish them from arrays by type alone, because an array can itself be a table. So any package that tries to handle both in the same function runs into this kind of confusion. Another instance, which I remember because I filed it, is https://github.com/JuliaArrays/StructArrays.jl/issues/278. According to the docs, `wrapdims` takes either a table or an array to wrap, and for a `StructArray` it dispatches to the array version.
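
To illustrate the point, here is a minimal sketch of why dispatch picks the array method. `describe` is a hypothetical function used only for illustration; it is not part of AxisKeys or Tables.jl. The sketch assumes that StructArrays.jl implements the Tables.jl interface, so `Tables.istable` should report `true` for the same object that is also an `AbstractArray`.

```julia
using StructArrays, Tables

sa = StructArray(a=[1, 1, 2, 2], b=[:x, :y, :x, :y], c=[10, 20, 30, 40])

# The same object satisfies both notions at once.
is_array = sa isa AbstractArray   # a StructArray is an AbstractArray
is_table = Tables.istable(sa)     # and it is expected to be a Tables.jl table too

# Hypothetical function with one method per intent (illustration only).
# Tables.jl has no abstract supertype for tables, so the "table" method
# cannot be restricted by type and has to be the untyped fallback.
describe(x) = "table method"
describe(x::AbstractArray) = "array method"

# Dispatch chooses the most specific applicable method, which is the array one.
describe(sa)   # "array method"
```

A `DataFrame` is not an `AbstractArray`, so it would reach the untyped fallback, which matches the behaviour reported above.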

aplavin, Jun 26 '25 08:06

I see: `wrapdims(::StructArray, ...)` dispatches to `wrapdims(::AbstractArray, ...)`. In that case, I think exporting

wraptable(table, value, names...; kw...) = _wrap_table(KeyedArray, identity, table, value, names...; kw...)

would be a solution.

julia> let table = StructArray(a=[1,1,2,2], b=[:x, :y, :x, :y], c=[10, 20, 30, 40])
           value = :c
           names = (:a, :b)
           kw = (;)
           AxisKeys._wrap_table(KeyedArray, identity, table, value, names...; kw...)
       end
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   a ∈ 2-element Vector{Int64}
→   b ∈ 2-element Vector{Symbol}
And data, 2×2 Matrix{Int64}:
      (:x)  (:y)
 (1)  10    20
 (2)  30    40

jariji, Jun 26 '25 17:06

Good point! Maybe that new `wraptable` can become the recommended way to go from a table to a `KeyedArray`, and it could also include fixes for https://github.com/mcabbott/AxisKeys.jl/issues/168 and https://github.com/mcabbott/AxisKeys.jl/issues/105 without any breaking changes.

aplavin, Jun 27 '25 07:06