ArraysOfArrays.jl icon indicating copy to clipboard operation
ArraysOfArrays.jl copied to clipboard

getindex(.) return type does not match eltype

Open colinxs opened this issue 6 years ago • 71 comments
trafficstars

First off, awesome idea/package! It's been quite useful in my own work.

I was really excited to use your package in conjunction with StructArrays but ran into an issue (that I attempted to fix with a PR to StructArrays) as explained here. The owner of that package pointed out, however, that the real issue is the following:

julia> x=nestedview(rand(2,3,10), 2);
julia> x[1] isa eltype(x)
false

What are your thoughts on modifying eltype to be consistent with the true return type of getindex?

-Colin

colinxs avatar Jul 17 '19 20:07 colinxs

Thanks!

Yes, the getindex/eltype inconsistency is annoying, esp. when using it with StructArray or TypedTable.

I definitely want to solve that, but it's a bit tricky - just using the underlying array type as eltype is not enough, because we get into trouble (e.g. with broadcast) if it's a view.

oschulz avatar Jul 18 '19 05:07 oschulz

There's also this problem: If we just make eltype returns the right thing, then there would be a mismatch between eltype and the element type deduced from T in foo(A::AbstractArray{T) on an ArraysOfArrays array. I think a lot of code is written based on the assumtion that T and eltype(A) match.

oschulz avatar Jul 18 '19 07:07 oschulz

Just to pitch in (as this was brought up in https://github.com/piever/StructArrays.jl/pull/84), I think that changing the eltype should be done not by adding a custom eltype method but by giving the correct type parameter to AbstractArray, i.e. ArrayOfArray<:AbstractArray{SubArray...}. Implementation-wise, this should be an extra type parameter of ArrayOfArray that gets computed in an internal constructor.

piever avatar Jul 18 '19 10:07 piever

I fully agree, @piever. That's what makes it a bit more tricky - but I have an idea (like you said, via constructor), I'll try it out.

oschulz avatar Jul 18 '19 17:07 oschulz

@piever, @colinxs, haven't forgotten about this - holiday season, but it's definitely on my short-term ToDo list.

oschulz avatar Aug 21 '19 07:08 oschulz

Update: Haven't forgotten about this, just had to be put on hold for some other urgent work. Will get back on this soon.

oschulz avatar Oct 21 '19 14:10 oschulz

Sorry this is taking so long - it's not forgotten and definitely still on my to-do list!

oschulz avatar Mar 17 '20 22:03 oschulz

@oschulz no worries, I actually figured this out and have been using it in my own codebases:

# For AbstractArray A and indices I, compute V = typeof(view(A, I...)) by:
# 1) Try to infer V from typeof(A) and typeof(I) by calling viewtype(typeof(A), typeof(I))
# 2) If !isconcretetype(V), fall back to typeof(view(A, I...))
# Custom subtypes of AbstractArray (e.g. UnsafeArray) could provide customized implementations
# of viewtype(A::AbstractArray, I::Tuple) or viewtype(::Type{<:AbstractArray}, ::Type{<:Tuple})
# if required.

function viewtype(A::AbstractArray, I::Tuple)
    T = viewtype(typeof(A), typeof(I))
    _viewtype(A, I, T, Val(isconcretetype(T)))
end
viewtype(A::AbstractArray, I...) = viewtype(A, I)

function viewtype(::Type{A}, ::Type{I}) where {A<:AbstractArray,I<:Tuple}
    Core.Compiler.return_type(view, _view_signature(A, I))
end
Base.@pure _view_signature(::Type{A}, ::Type{I}) where {A,I<:Tuple} = Tuple{A, I.parameters...}

_viewtype(A, I, T, ::Val{true}) = T
function _viewtype(A, I, T, ::Val{false})
    try
        return typeof(view(A, I...))
    catch e
        Istring = join(map(string, I), ", ")
        printstyled(stderr,
        """
        Unable to infer typeof(view($(typeof(A)), $(join(map(i -> string(typeof(i)), I), ", "))))
        Only other option is to try typeof(view(A, $Istring)) but that resulted in below error.
        Try passing in valid indices or implement:
            viewtype(::Type{$(typeof(A))}, ::Type{$(typeof(I))})
        """, color=:light_red)
        rethrow(e)
    end
end

I've got an in-works implementation something similar to ArrayOfSimilarArrays that I'll make a PR for.

Thanks again for keeping on top of this! :)

colinxs avatar Mar 18 '20 00:03 colinxs

Also, two other options for _viewtype(A, I, T, ::Val{false}) that I considered were:

_viewtype(A, I, T, ::Val{false}) = @inbounds view(A, I...)
@propagate_inbounds _viewtype(A, I, T, ::Val{false}) = view(A, I...)

The first has the benefit of working for empty arrays, but that seems unsafe. The second option I rejected because that would require e.g. @inbounds nestedview(A_flat, 2), which seemed weird to do when calling a constructor.

colinxs avatar Mar 18 '20 00:03 colinxs

Thanks, I really got to add this soon!

oschulz avatar Apr 14 '20 21:04 oschulz

As an alternate idea, you could use the SlicedArray that I made here.

I ended up needing the ability to slice along arbitrary dimensions instead just the inner M dimensions and ran across JuliennedArrays.jl. I took the best parts of that and ArraysOfArrays to make SlicedArray which I use internally for Lyceum. An ArrayOfArrays{T,M} could be implemented as:

function sliceinner(A::AbstractArray{<:Any,L}, ::Val{M}) where {L,M}
  slice(A, ntuple(_ -> Colon(), Val(M))..., ntuple(_ -> *, Val(L-M))...)
end

You could also use JuliennedArrays directly, SlicedArray is the same idea just with more overloads for Base functions, comprehensive tests, and better performance. I'd like to upstream those changes in the future but haven't had time yet. If this sounds useful I could register SpecialArrays (currently it's only registered in the Lyceum registry).

As an aside, you might find some of the test utilities I wrote useful, given that you have a lot of packages providing various custom arrays. That test suite is what originally prompted the changes to ElasticArrays I made in https://github.com/JuliaArrays/ElasticArrays.jl/pull/20. Perhaps factoring that out to something like ArrayTestTools.jl would be useful to JuliaArrays/others?

colinxs avatar Apr 14 '20 21:04 colinxs

I could register SpecialArrays

Hm, since ArraysOfArrays is already registered, and since there would be conflicts with some exported functions (like innersize), I guess I would prefer to backport your improvements to ArrayOfSimilarArrays. If the package has to evolve in the process (incl. some API changes), that would be fine. I've actually been planning to move it under the "JuliaArrays" organization to make it a bit more "official" - talked to Tim Holy about it a JuliaCon, just haven't gotten 'round to it yet.

Perhaps factoring that out to something like ArrayTestTools.jl would be useful to JuliaArrays/others?

Yes, that might indeed be very useful!

oschulz avatar Apr 14 '20 22:04 oschulz

Makes sense to backport the changes ArraysOfArrays.jl! Given that SlicedArray (and also FlattenedArray, which is a flattened view of a nested array) were inspired largely by JuliennedArrays.jl it'd be good to check with its dev (@bramtayl) first? The way forward could then be to upstream SlicedArray/FlattenedArray to JuliennedArrays.jl and then have ArraysOfArrays.jl depend on that? The alternative would be to just put everything in ArraysOfArrays.jl directly.

In the meantime, I'll work on factoring out ArrayTestTools.jl.

colinxs avatar Apr 14 '20 22:04 colinxs

It makes me so happy when I find out people are reading my packages! Happy to accept PRs to JuliennedArrays

bramtayl avatar Apr 14 '20 22:04 bramtayl

@bramtayl, I just came across JuliennedArrays a while ago - from what I saw, it's quite similar to ArrayOfSimilarArrays, but trades increased flexibility (slicing in any dimension) for some performance?

Would you possibly be interested in moving it into ArraysOfArrays, or would you prefer to keep it as a lightweight separate package?

oschulz avatar Apr 14 '20 23:04 oschulz

@colinxs having a FlattenedArray in ArraysOfArrays would be great - I've been meaning to add something like that, to replace the current generic implementation of flatview

@inline flatview(A::AbstractArray{<:AbstractArray}) = Base.Iterators.flatten(A)

with something that actually returns an AbstractArray-subtype. One of the things that kept sliding to the bottom of my to-do list ... :-)

oschulz avatar Apr 14 '20 23:04 oschulz

@oschultz I'd be thrilled to move it into ArraysOfArrays so I don't have to maintain the code anymore. I'd be curious to know more about performance differences

bramtayl avatar Apr 14 '20 23:04 bramtayl

hahah and I love when devs are super responsive and eager to accept PRs :).

@bramtayl, when you have a moment want to look over SlicedArray and FlattenedArray. The major changes are:

Slices

  • renamed to SlicedArray to match Base's naming convention (SubArray, PermutedDimsArray)
  • Exposes slice(A, dims::Tuple), slice(A, dims::Integer...), and slice(A, Union{Colon,typeof(*)}) as the user-facing API (similar to SubArray and view). slice(A, :, *, :) would be equivalent to Slices(A, True(), False(), True()), it's just syntactical sugar.
  • Adds several methods for Base functions like resize!, append! etc.
  • Adds a drop-in replacement for Base.mapslices
  • Adds adjoints for Zygote.jl (conditionally loaded using Requires.jl)

Align

  • Renamed to FlattenedArray
  • Exposes flatview(A) as the user-facing API
  • Removed the ability to specify alongs since you can get the same functionality by composing with PermutedDimsArray
  • Adds adjoints for Zygote.jl

I also significantly re-wrote the internals for better inference/performance and added full test coverage.

colinxs avatar Apr 14 '20 23:04 colinxs

Concerning the performance difference: there isn't any. All of the fancy indexing that allows slicing along arbitrary dimensions effectively compiles out.

colinxs avatar Apr 14 '20 23:04 colinxs

@colinxs those all sound like improvements to me. Earlier versions of JuliennedArrays used * and : to specify dimensions (I think that was a Tim Holy suggestion). I changed to True and False because I got excited about the wider potential of Typed Bools (hasn't panned out yet). I'm not sure I understand about PermutedDimsArray.

bramtayl avatar Apr 14 '20 23:04 bramtayl

I'd be thrilled to move it into ArraysOfArrays so I don't have to maintain the code anymore.

Haha, and here I thought we'd maintain it together. ;-)

I'd be curious to know more about performance differences

Good point, we should definitely do some benchmarking!

oschulz avatar Apr 14 '20 23:04 oschulz

Regarding PermutedDimsArray - when I wrote ArrayOfSimilarArrays I actually thought that applications that needed to slice along another axis could us an ArrayOfSimilarArrays wrapped around a PermutedDimsArray. Would be interesting to see if that could achieve similar performance to JuliennedArrays.Slices. If so, maybe we could reduce code there quite a bit and make things easier to maintain if we join it all in ArraysOfArrays.

oschulz avatar Apr 14 '20 23:04 oschulz

Adds adjoints for Zygote.jl

Oh, that would be nice to have here (ArraysOfArrays) too!

oschulz avatar Apr 14 '20 23:04 oschulz

@bramtayl PermutedDimsArray just remaps dimensions e.g.:

julia> A=rand(2,3)
2×3 Array{Float64,2}:
 0.307533  0.822543  0.530326
 0.591914  0.782564  0.948809

julia> B=PermutedDimsArray(A, (2,1))
3×2 PermutedDimsArray(::Array{Float64,2}, (2, 1)) with eltype Float64:
 0.307533  0.591914
 0.822543  0.782564
 0.530326  0.948809

julia> A[1,2] == B[2,1]
true

So if you had A = [rand(2,3) for i=1:4], Align(A, True(), True(), False()) and PermutedDimsArray(Align(A, False(), True(), True()), (2, 3, 1)) would be equal.

colinxs avatar Apr 15 '20 00:04 colinxs

Oh I see. Yes that makes sense to me. I'm not sure if there would be a performance penalty for PermutedDimsArray?

bramtayl avatar Apr 15 '20 00:04 bramtayl

@oschulz

I thought about doing the same thing, but you miss out on some optimizations with the extra layer of indirection. If we had:

A = rand(2,3,4)
B = slice(A, :, *, *)

Then doing B[:, 1] as implemented is basically a no-op because we can just forward the Colon to the parent array and "re-slice". See this comment and the subsequent getindex definition for an example.

If you rely on PermutedDimsArray, then you'd be falling back to Cartesian indexing and collecting the output in a new array.

colinxs avatar Apr 15 '20 00:04 colinxs

@bramtayl

SlicedArray still uses True/False internally, but I figured using :/* was a friendlier user-facing API.

colinxs avatar Apr 15 '20 00:04 colinxs

I like : / *!

But in your example above - slice(A, :, *, *) could already be done with just an ArrayOfSimilarArrays, without PermutedDimsArray, right? Or do if mix : and * up, here`?

oschulz avatar Apr 15 '20 00:04 oschulz

In any case, if a more generic version of ArrayOfSimilarArrays is possible without loss of performance, we could add it to ArraysOfArrays and reduce ArrayOfSimilarArrays to a type alias (possibly with an optimized implementation for some functions, if necessary).

oschulz avatar Apr 15 '20 00:04 oschulz

@oschulz exactly, but ArrayOfSimilarArrays is just a special-case of a SlicedArray or @bramtayl's Slices.

Basically, if we incorporate SlicedArray/Slices into ArraysOfArrays.jl then it would entirely replace ArrayOfSimilarArrays.

@oschulz @bramtayl given the above improvements I made would you all be comfortable incorporating FlattenedArray/SlicedArray instead of Align/Slices?

colinxs avatar Apr 15 '20 00:04 colinxs