ArrayInterface.jl
ArrayInterface.jl copied to clipboard
Remove unnecessary methods from `setindex`
The current implementation of setindex is problematic for the following reasons:
- It is not compatible with
OffsetArrays. - It is not compatible with non-numeric types such as
Array{string}. - It will produce ambiguities with StaticArrays.jl (https://github.com/JuliaArrays/StaticArrays.jl/issues/1039).
Before this PR
julia> using ArrayInterface
julia> x = fill("a", 2, 3)
2×3 Matrix{String}:
"a" "a" "a"
"a" "a" "a"
julia> Base.setindex(x, "b", 2, 1)
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 3 and 2")
Stacktrace:
[1] _bcs1
@ ./broadcast.jl:516 [inlined]
[2] _bcs (repeats 2 times)
@ ./broadcast.jl:510 [inlined]
[3] broadcast_shape
@ ./broadcast.jl:504 [inlined]
[4] combine_axes
@ ./broadcast.jl:499 [inlined]
[5] _axes
@ ./broadcast.jl:224 [inlined]
[6] axes
@ ./broadcast.jl:222 [inlined]
[7] combine_axes
@ ./broadcast.jl:499 [inlined]
[8] instantiate
@ ./broadcast.jl:281 [inlined]
[9] materialize
@ ./broadcast.jl:860 [inlined]
[10] setindex(x::Matrix{String}, v::String, i::Int64, j::Int64)
@ ArrayInterfaceCore ~/.julia/packages/ArrayInterfaceCore/wwYvJ/src/ArrayInterfaceCore.jl:131
[11] top-level scope
@ REPL[4]:1
After this PR
julia> using ArrayInterface
julia> x = fill("a", 2, 3)
2×3 Matrix{String}:
"a" "a" "a"
"a" "a" "a"
julia> Base.setindex(x, "b", 2, 1)
2×3 Matrix{String}:
"a" "a" "a"
"b" "a" "a"
Before this PR
julia> x = fill(10., 5)
5-element Vector{Float64}:
10.0
10.0
10.0
10.0
10.0
julia> @benchmark Base.setindex(x, 2., 2)
BenchmarkTools.Trial: 10000 samples with 986 evaluations.
Range (min … max): 51.406 ns … 1.120 μs ┊ GC (min … max): 0.00% … 93.28%
Time (median): 53.611 ns ┊ GC (median): 0.00%
Time (mean ± σ): 56.940 ns ± 46.035 ns ┊ GC (mean ± σ): 3.58% ± 4.23%
▄▆██▇▆▄▂▁▁▁▁▁ ▁ ▂
▇██████████████▇▇▅▅▆▄▅▅▃▃▄▅▃▁▄▁▃▃▁▃▁▁▃▁▁▁▄▆███████▇▆▆▆▆▇█▇█ █
51.4 ns Histogram: log(frequency) by time 81.3 ns <
Memory estimate: 96 bytes, allocs estimate: 1.
After this PR
julia> x = fill(10., 5)
5-element Vector{Float64}:
10.0
10.0
10.0
10.0
10.0
julia> @benchmark Base.setindex(x, 2., 2)
BenchmarkTools.Trial: 10000 samples with 991 evaluations.
Range (min … max): 41.592 ns … 1.076 μs ┊ GC (min … max): 0.00% … 95.34%
Time (median): 48.921 ns ┊ GC (median): 0.00%
Time (mean ± σ): 51.771 ns ± 43.928 ns ┊ GC (mean ± σ): 3.77% ± 4.25%
▂▂▃▃▁ ▂▅▇█▇▆▃▂▁▂▁▁ ▁▁ ▂
█████▇▆▅▆█████████████▇▇▆▆▅▅▆▅▃▁▅▄▄▄▄▄▆▃▃▃▄▅▆▇██████▇▆▆▆▇▇▇ █
41.6 ns Histogram: log(frequency) by time 75.6 ns <
Memory estimate: 96 bytes, allocs estimate: 1.
This PR also fixes setindex for non-square matrix
Before this PR
julia> a = ones(3,3)
3×3 Matrix{Float64}:
1.0 1.0 1.0
1.0 1.0 1.0
1.0 1.0 1.0
julia> Base.setindex(a,0.0,3,2)
3×3 Matrix{Float64}:
1.0 1.0 1.0
1.0 1.0 1.0
0.0 0.0 0.0
julia> a = ones(3,4)
3×4 Matrix{Float64}:
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
julia> Base.setindex(a,0.0,3,2)
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 4 and 2")
Stacktrace:
[1] _bcs1
@ ./broadcast.jl:516 [inlined]
[2] _bcs (repeats 2 times)
@ ./broadcast.jl:510 [inlined]
[3] broadcast_shape
@ ./broadcast.jl:504 [inlined]
[4] combine_axes
@ ./broadcast.jl:499 [inlined]
[5] _axes
@ ./broadcast.jl:224 [inlined]
[6] axes
@ ./broadcast.jl:222 [inlined]
[7] combine_axes
@ ./broadcast.jl:499 [inlined]
[8] instantiate
@ ./broadcast.jl:281 [inlined]
[9] materialize
@ ./broadcast.jl:860 [inlined]
[10] setindex(x::Matrix{Float64}, v::Float64, i::Int64, j::Int64)
@ ArrayInterfaceCore ~/.julia/packages/ArrayInterfaceCore/wwYvJ/src/ArrayInterfaceCore.jl:131
[11] top-level scope
@ REPL[10]:1
After this PR
julia> a = ones(3,3)
3×3 Matrix{Float64}:
1.0 1.0 1.0
1.0 1.0 1.0
1.0 1.0 1.0
julia> Base.setindex(a,0.0,3,2)
3×3 Matrix{Float64}:
1.0 1.0 1.0
1.0 1.0 1.0
1.0 0.0 1.0
julia> a = ones(3,4)
3×4 Matrix{Float64}:
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
julia> Base.setindex(a,0.0,3,2)
3×4 Matrix{Float64}:
1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0
1.0 0.0 1.0 1.0
IIRC, this will break codes that need to be non-mutating on Array for compatibility with automatic differentiation. One example is trying to do Zygote over a code like https://github.com/JuliaDiff/FiniteDiff.jl/blob/cfee15e33394067f883fb685ea9f305d0bce9f0f/src/jacobians.jl#L196 . It at least was a missing function definition in Base and ideally these definitions would get upstreamed, maybe they already did. IIRC setindex used to just error by default on Arrays and it was a functionality that only Tuples and StaticArrays had?
Codecov Report
Merging #305 (aa937a4) into master (1fb77f5) will not change coverage. The diff coverage is
n/a.
@@ Coverage Diff @@
## master #305 +/- ##
=======================================
Coverage 91.79% 91.79%
=======================================
Files 9 9
Lines 1413 1413
=======================================
Hits 1297 1297
Misses 116 116
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 1fb77f5...aa937a4. Read the comment docs.
Julia doesn't have this yet (see https://github.com/JuliaLang/julia/pull/33495 ). Anyway, that PR looks much better to me than the code here. Zygote compatibility should be easily possible by adding an rrule for setindex.
We can add the code from the PR here. Or @tkf do you think you can get this finished for Base?
Anyway, that PR looks much better to me than the code here.
It's not AD compatible so it would cause downstream failures.
Zygote compatibility should be easily possible by adding an rrule for setindex.
That would be fine. If someone defines an rrule to delete these that's probably a better option. But I don't think we should merge something that we know is going to be downstream breaking like this, we should at least wait until a safer version is made, however it's done.
That would be fine. If someone defines an rrule to delete these that's probably a better option. But I don't think we should merge something that we know is going to be downstream breaking like this, we should at least wait until a safer version is made, however it's done.
I'm not much familiar with Zygote.jl and related AD systems. I have some questions:
- Could you provide a MWE for the downstream breaking?
- Should we add some tests for AD and update ChainRulesTestUtils.jl in
[extras]table inlib/ArrayInterfaceCore/Project.toml? - Should we separate this PR into removing methods in
setindexand AD-related things?
One example is trying to do Zygote over a code like https://github.com/JuliaDiff/FiniteDiff.jl/blob/cfee15e33394067f883fb685ea9f305d0bce9f0f/src/jacobians.jl#L196 .
I have tested FiniteDiff/test/runtests.jl with changes in this PR in my local PC, but the test didn't produce errors.
Yes, of course that doesn't because it doesn't hit Zygote/ReverseDiff on the finite differencing in that test suite. But there are some clear cases where this will cause failures. If you differentiate the solver in OOP form and put it into finite difference mode then a mutating setindex will cause a failure. So probably the clearest example of this would be one of the DiffEqSensitivity.jl in the oop function adjoint test with sensealg =ReverseDiffAdjoint() with a solver like TRBDF2(autodiff=false). And this is why we have a defensive practice to not take code that violates these kinds of principles:
Functions should either attempt to be non-allocating and reuse caches, or treat inputs as immutable Mutating codes and non-mutating codes fall into different worlds. When a code is fully immutable, the compiler can better > reason about dependencies, optimize the code, and check for correctness. However, many times a code making the fullest > use of mutation can outperform even what the best compilers of today can generate. That said, the worst of all worlds is when code mixes mutation with non-mutating code. Not only is this a mishmash of coding styles, it has the potential non-> locality and compiler proof issues of mutating code while not fully benefiting from the mutation.
https://github.com/SciML/SciMLStyle#functions-should-either-attempt-to-be-non-allocating-and-reuse-caches-or-treat-inputs-as-immutable
We should probably expand the downstream tests to include DiffEqSensitivity Core1/2/3/4/5 because those tests would probably be the most prone to catching interface breaks since they rely on the interfaces being correct (since they need to impose AD + GPU + ... constraints all at once).
Rebase after https://github.com/JuliaArrays/ArrayInterface.jl/pull/313 for the new tests, and that might see the break. Otherwise I'll explicitly add something to show the break (but I'm traveling right now so it might now happen ASAP)
I just rebased to the current master branch yesterday. I'm waiting for approval to run the workflows.
The simplest thing to do might be to add a ChainRules.rrule for setindex to ChainRules.jl. That should be rather simple to define, and if that merges then I'm 👍 on this. I still want to find the time to construct the case that hits this though, since it's a deep and nasty one to find/fix, but if we have the chain rule then it will go away.
Downstream has been handled.
Thanks for the merging!