StructArrays.jl

`replace_storage` but no `eltype` restriction

Open darsnack opened this issue 4 years ago • 3 comments

After working a bit on #175, I feel that replace_storage works well for arrays of primitive types, but it doesn't work well for wrapper array types.

Consider the following two array types:

struct MyVector{T, S} <: AbstractVector{T}
    buffer::S
    # potentially other fields
end
MyVector{T}(N) where T = MyVector{T, Vector{T}}(zeros(T, N))

struct ArrayOfMyVectors{T, N, S<:AbstractArray{T, N}, P} <: AbstractArray{P, N}
    buffer::S
    # inner constructor computes P correctly using Core.Compiler.return_type
end

For array types like ArrayOfMyVectors, it is reasonable to want getindex to return a view into buffer wrapped as a MyVector. In other words,

x = ArrayOfMyVectors{T}(...)
typeof(x[1, 1]) == MyVector{T, SubArray{T, 1, S, ...}}

Now, suppose you have the following:

struct Foo{T}
    bar::T
end

x = StructArray(Foo(MyVector{Int}(10)) for _ in 1:2, _ in 1:3)
replace_storage(x) do v
    if v isa Array{<:MyVector}
        return ArrayOfMyVectors{...}(...) # compute based on v
    else
        return v
    end
end

This won't work, because eltype(x.bar) is originally MyVector{Int, Vector{Int}}, while the eltype of the replaced storage is MyVector{Int, SubArray{...}}. Note that, unlike https://github.com/JuliaArrays/ArraysOfArrays.jl/issues/2, eltype and getindex are internally consistent for ArrayOfMyVectors.
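To make the mismatch concrete, here is a minimal sketch in plain Julia (no StructArrays needed). The size/getindex methods and the one-argument MyVector(buffer) constructor are additions for illustration only; they are not part of the definitions above.

```julia
struct MyVector{T, S} <: AbstractVector{T}
    buffer::S
end
# Minimal AbstractVector interface so the wrapper can be indexed and shown.
Base.size(v::MyVector) = size(v.buffer)
Base.getindex(v::MyVector, i::Int) = v.buffer[i]
# Illustrative convenience constructor (assumption): infer T and S from the buffer.
MyVector(buffer::AbstractVector{T}) where {T} = MyVector{T, typeof(buffer)}(buffer)

struct Foo{T}
    bar::T
end

storage = zeros(Int, 10, 6)             # shared backing array
owned   = MyVector(zeros(Int, 10))      # S == Vector{Int}
viewed  = MyVector(view(storage, :, 1)) # S == SubArray{Int, 1, ...}

# Same element type, different wrapper type.
same_eltype   = eltype(owned) == eltype(viewed)
same_wrapper  = typeof(owned) == typeof(viewed)
fits_old_type = Foo(viewed) isa Foo{typeof(owned)}
```

Both wrappers have element type Int, but they differ in the storage parameter S, so a Foo parameterized on the owning MyVector type does not describe the view-backed elements.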


So, would it make sense to have a variant (different function or keyword arg) that behaved like replace_storage but returned a StructArray with a totally new eltype? Am I just missing something completely here?

darsnack avatar Apr 14 '21 14:04 darsnack

Sorry I did not reply to your comment on #175, I'll write my thoughts here.

The problem with replace_storage(f, s), if f changes the eltype, is that it is not clear what the eltype of the outer StructArray should be: it would be the original eltype T with its parameters modified in some way that is hard to determine automatically. If you know how the parameters will change, you can probably roll your own version of replace_storage and update T correctly (the whole function is very simple, see here and the method a few lines up).
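As a rough sketch of what "rolling your own" could look like: the helper name replace_storage_retype and its retype argument are hypothetical, and the code assumes a recent StructArrays version where StructArrays.components returns the NamedTuple of field arrays. Unlike replace_storage, this sketch does not recurse into nested StructArrays.

```julia
using StructArrays

struct Foo{T}
    bar::T
end

# Hypothetical helper: apply f to every component array, then let the caller
# say how the element type should be rebuilt from the new component eltypes.
function replace_storage_retype(f, retype, s::StructArray)
    cols = StructArrays.components(s)      # NamedTuple of field arrays
    newcols = map(f, cols)                 # replaced storage, same field names
    T = retype(map(eltype, newcols))       # caller computes the new eltype
    return StructArray{T}(newcols)
end

x = StructArray(Foo(Float32(i)) for i in 1:4)

# Widen the storage to Float64 and update the eltype to match.
y = replace_storage_retype(v -> Float64.(v), nt -> Foo{nt.bar}, x)
```

Here the caller knows that Foo has a single parameter equal to the type of bar, which is exactly the knowledge that is hard to recover automatically for an arbitrary T.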

Alternatively, one would need to figure out a way to "adjust" the outer type T to the new "fieldtypes" that one gets from calling f and I'm not sure how to do that in general. Maybe it could become a part of the StructArrays interface, a trait that determines how your type T should be updated if the "fieldtypes" change.

piever avatar Apr 15 '21 15:04 piever

Yes, I guess that's true; I don't know how to automatically merge the replaced types into the struct type parameters. Would you be interested in a PR that adds an interface to specify this?

Something like

eltype_from_fields(::Type{<:Foo}, T) = Foo{T}
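A rough sketch of how such a trait might be used, in plain Julia. The name eltype_from_fields comes from the line above; the Pt type and the calling convention (one positional argument per new field type) are assumptions for illustration, not an existing StructArrays interface.

```julia
struct Foo{T}
    bar::T
end

struct Pt{A, B}   # hypothetical two-field type
    x::A
    y::B
end

# Each type opts in by saying how to rebuild itself from new field types.
eltype_from_fields(::Type{<:Foo}, T) = Foo{T}
eltype_from_fields(::Type{<:Pt}, A, B) = Pt{A, B}

old_type = Foo{Vector{Int}}
new_bar  = typeof(view([1, 2, 3], 1:2))   # a SubArray type
new_type = eltype_from_fields(old_type, new_bar)

widened = eltype_from_fields(Pt{Int, Int}, Float64, Int)
```

A generic variant of replace_storage could then call this trait with the eltypes of the replaced components instead of reusing the original T.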

darsnack avatar Apr 16 '21 13:04 darsnack

Yes, a PR is definitely welcome. As long as we mark this new interface as experimental (IMO the design is not completely clear) I think there's no harm in adding this. In particular, it may require a little bit of tinkering to figure out whether it can also be useful for the collection mechanism in collect_structarray.

piever avatar Apr 16 '21 14:04 piever