StructArrays.jl icon indicating copy to clipboard operation
StructArrays.jl copied to clipboard

Allow columns with variable getindex return types

Open colinxs opened this issue 6 years ago • 4 comments

There's an interesting new package floating about that has been quite useful: https://github.com/oschulz/ArraysOfArrays.jl. The gist is that it allows viewing an M+N dimensional array as an M-dimensional array of N-dimensional arrays. To achieve this, getindex returns a view of the underlying data, which has a variable (but stable) return type. StructArrays relies on a constant return type in createinstance. This leads to some odd behavior:

julia> s=StructArray(x=nestedview(rand(2,3,10), 2));
julia> s[1].x .= 2;
julia> s[1].x
2×3 Array{Float64,2}:
 0.979757  0.282793  0.948475
 0.850188  0.611789  0.959696

This is because:

julia> eltype(s.x)
Array{Float64,2}

So in createinstance a call to T(args), corresponding to NamedTuple{(:x,),Tuple{Array{Float64,2}}(s.x[1]) is made, but typeof(s.x[1])<:SubArray so a copy of the data is made.

This PR fixes the above issue but introduces a new one: now the inferred return type for getindex(::StructArray, ...) is Any.

-Colin

colinxs avatar Jun 27 '19 05:06 colinxs

Added a @generated version of get_ith so that its return type can be inferred.

colinxs avatar Jun 27 '19 05:06 colinxs

Codecov Report

Merging #84 into master will decrease coverage by 0.19%. The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #84     +/-   ##
=========================================
- Coverage   91.08%   90.88%   -0.2%     
=========================================
  Files           9        9             
  Lines         370      373      +3     
=========================================
+ Hits          337      339      +2     
- Misses         33       34      +1
Impacted Files Coverage Δ
src/structarray.jl 86.86% <100%> (+0.27%) :arrow_up:
src/interface.jl 71.42% <50%> (-11.91%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d22a69e...862af3a. Read the comment docs.

codecov-io avatar Jun 27 '19 05:06 codecov-io

Thanks for the PR! In general I'm hoping to reduce the number of generated functions. I used to have generated getindex but it turns out it was not needed. Here I think the problem is a bug with ArrayOfArrays.jl:

julia> x=nestedview(rand(2,3,10), 2);

julia> x[1] isa eltype(x)
false

Shouldn't ArrayOfSimilarArrays have the correct eltype instead?

piever avatar Jul 01 '19 11:07 piever

Sorry for the delay, been a bit busy!

I entirely see you point. I brought the issue up in ArraysOfArrays here. Would you mind leaving this issue open until a fix (either here or there) is implemented?

Thanks! Colin

colinxs avatar Jul 17 '19 20:07 colinxs