SparseArrays.jl icon indicating copy to clipboard operation
SparseArrays.jl copied to clipboard

Why is `reinterpret` prohibited?

Open JeffFessler opened this issue 2 years ago • 11 comments

I have a Unitful sparse array that I would like to reinterpret as a non-unitful sparse array so that I can pass it to a method that sadly will not accept unitful spars data (lldl in LimitedLDLFactorizations FWIW).

But for some (undocumented?) reason this package is (unnecessarily?) completely prohibiting reinterpret here: https://github.com/JuliaSparse/SparseArrays.jl/blob/311b4b4130d9f28a6b5eb55cb7c818e4f7858719/src/abstractsparse.jl#L80

Here is my MWE of what I would like to do (which fails at the above line):

using Unitful: s
using SparseArrays
A = spdiagm(Float32.(1:3)*s^2) # unitful sparse array
T = eltype(A / oneunit(A))
B = reinterpret(T, A)

My view is that reinterpret is something that one uses with some trepidation and this package should let users attempt it at their own risk, instead of prohibiting it. Specifically I would like to allow reinterprets from SparseMatrixCSC{T1, Int64} to SparseMatrixCSC{T2, Int64} or some generalization thereof. Is there openness to some cases of reinterpret being allowed? Or is there some huge risk I am overlooking?

JeffFessler avatar Nov 21 '22 15:11 JeffFessler

Maybe look at the git blame for that to see if it points to some discussion? I suppose that reinterpret on values would be quite usable.

ViralBShah avatar Nov 21 '22 15:11 ViralBShah

Follow up: I tried to write my own specialized method to circumvent the general prohibition, but the method failed because reinterpret of the nonzero values returns an Base.ReinterpretArray type that is a AbstractVector whereas the constructor for SparseMatrixCSC currently requires a Vector here:

https://github.com/JuliaSparse/SparseArrays.jl/blob/311b4b4130d9f28a6b5eb55cb7c818e4f7858719/src/sparsematrix.jl#L27

Perhaps that could be relaxed with suitable checking of 1-based indexing etc.

import Base: reinterpret
using SparseArrays
function reinterpret(E::DataType, A::SparseMatrixCSC{<:Number, Int64})
    tmp = reinterpret(E, A.nzval)
    SparseMatrixCSC(A.m, A.n, A.colptr, A.rowval, tmp) 
end

I will look into the git history!

JeffFessler avatar Nov 21 '22 15:11 JeffFessler

I found a deprecate that points back to a 2017 issue: https://github.com/JuliaLang/julia/issues/22849 More research needed...

JeffFessler avatar Nov 21 '22 19:11 JeffFessler

Switching to AbstractVector would require a new parameter, and to avoid breakage we didn't want to introduce that in the 1.x series. The parameter isn't so much an issue as downstream packages expecting Vector not AbstractVector

We could look into doing that for 1.10, or perhaps as a separate package (I'll have more on that later this week).

rayegun avatar Nov 21 '22 20:11 rayegun

I concocted the following kludgey work-around for now using Base.unsafe_wrap 😬

import Base: reinterpret
using SparseArrays
function reinterpret(T::DataType, A::SparseMatrixCSC{<:Number, Int64})
    tmp = reinterpret(T, A.nzval)
    v = Base.unsafe_wrap(Array, pointer(tmp), length(tmp))
    SparseMatrixCSC(A.m, A.n, A.colptr, A.rowval, v)
end

using Unitful: s
A = spdiagm(Float32.(1:3)*s^2) # unitful sparse array
T = eltype(A / oneunit(A))
B = reinterpret(T, A)

I look forward to better options :)

JeffFessler avatar Nov 22 '22 14:11 JeffFessler

This is really dangerous I think. A must stay alive otherwise you will segfault.

rayegun avatar Nov 22 '22 17:11 rayegun

If there is a specialized method you want to avoid calling, invoke is your friend.

using Unitful: s
using SparseArrays
A = spdiagm(Float32.(1:3)*s^2) # unitful sparse array
T = eltype(A / oneunit(A))
B = invoke(reinterpret, Tuple{Type{T}, AbstractArray{eltype(A), ndims(A)}}, T, A)

While this solves the OP's use case, it does not close the issue of "Why is reinterpret prohibited?"

The git blame on this is hard to find, but here it is: https://github.com/JuliaLang/julia/pull/23750#issuecomment-334969352. I was just looking into it because I came across this issue in a different context.

Fundamentally, the issue is this:

julia> s = spzeros(3)
3-element SparseVector{Float64, Int64} with 0 stored entries

julia> rs = invoke(reinterpret, Tuple{Type{Int}, AbstractVector{Float64}}, Int, s)
3-element reinterpret(Int64, ::SparseVector{Float64, Int64}):
 0
 0
 0

julia> x = reinterpret(Int, -0.0)
-9223372036854775808

julia> rs[1] = x # Line 4
-9223372036854775808

julia> rs[1] == x
false

julia> rs
3-element reinterpret(Int64, ::SparseVector{Float64, Int64}):
 0
 0
 0

On line 4, we reinterpret x back into a Float64 (-0.0), which iszero (iszero(-0.0)) so we don't store it. This is acceptable for sparse arrays of Float64s because 0.0 == -0.0, but it's kinda weird for this reinterpreted array because the Int -9223372036854775808 is very much not equal to 0.

I think that this total prohibition is excessive and would prefer to throw only when performing a setindex that suffers from the above issue.

LilithHafner avatar Nov 24 '22 05:11 LilithHafner

Implementing the behavior @StefanKarpinski suggested in https://github.com/JuliaSparse/SparseArrays.jl/issues/55#issuecomment-326281890 would fix this problem and allow us to use standard reinterpret plumbing with no modifications.

LilithHafner avatar Nov 24 '22 08:11 LilithHafner

Hi @LilithHafner thanks so much for explaining about invoke and the history and for drafting a PR on this topic. Unfortunately the invoke returns an AbstractArray that appears to drop awareness of the sparsity, so I think I will (eventually) need a reinterpret for sparse matrices. In my case I would be perfectly happy with an immutable sparse array so that the assignment problem would be irrelevant, but maybe that is too specialized.

JeffFessler avatar Nov 27 '22 21:11 JeffFessler

julia> x = reinterpret(Int, -0.0)

IMHO anyone who does this should expect to get weird results!

JeffFessler avatar Nov 27 '22 21:11 JeffFessler

IMHO anyone who does this should expect to get weird results!

reinterpret has well-defined semantics, and so long as we conform to them it allows for certain optimizations. It's useful for, among other things, sorting:

https://github.com/JuliaLang/julia/blob/5495b8d67a66720559cfd8c13ebb315a80e4e579/base/sort.jl#L1495

I think I will (eventually) need a reinterpret for sparse matrices.

Agreed

In my case I would be perfectly happy with an immutable sparse array so that the assignment problem would be irrelevant,

Half a loaf is better than no loaf.

LilithHafner avatar Nov 28 '22 00:11 LilithHafner

@LilithHafner thank you for your work on this and your persistence in the PR :) So that I can plan ahead, will this appear in 1.9 or not until 1.10 or later?

JeffFessler avatar Jan 21 '23 03:01 JeffFessler

1.10. we need to get it onto Julia master with other things and let it all get tested out there.

ViralBShah avatar Jan 21 '23 09:01 ViralBShah

No, this will be backported to v1.9 for some sorting reason, I believe.

dkarrasch avatar Jan 21 '23 11:01 dkarrasch

Yep. It fixes this regression: https://github.com/JuliaSparse/SparseArrays.jl/issues/304 & https://github.com/JuliaLang/julia/issues/48079

LilithHafner avatar Jan 21 '23 14:01 LilithHafner