SparseArrays.jl
[Proposal] Generalize SparseVector to allow non-stored value other than zero.
SparseVector is designed assuming that all values that are not stored explicitly are equal to zero.
I have a use case in which the non-stored value should be one(::MyType), rather than zero(::MyType).
You can get this by modifying SparseVector with
neutral(x) = zero(x)
isneutral(x) = iszero(x)
neutrals(::Type{T}, args...) where T = zeros(T, args...)
and replacing most occurrences of zero, iszero, and zeros (the last used when converting to a dense vector) in sparsevector.jl with neutral, isneutral, and neutrals. This should leave all current code using SparseVector unchanged. (There does appear to be a small performance hit; I'm not sure why, but it is probably not essential and can be fixed.)
Then, I can define neutral(x::MyType) = one(x), etc.
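To make this concrete, here is a minimal sketch of what the user-side definitions could look like. MyType here is a hypothetical placeholder for the actual type; only neutral, isneutral, and neutrals are from the proposal above.

```julia
# Hypothetical user type; stands in for MyType from the proposal.
struct MyType
    val::Float64
end
Base.one(::Type{MyType}) = MyType(1.0)
Base.one(x::MyType) = one(MyType)

# Specialize the proposed hooks so that one(MyType) is the non-stored value.
neutral(x::MyType) = one(x)
isneutral(x::MyType) = x == one(MyType)
neutrals(::Type{MyType}, n::Integer) = fill(one(MyType), n)
```

With these in place, the generic fallbacks (neutral(x) = zero(x), etc.) keep the existing behavior for all other types.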
I have started doing this by copying code from sparsevector.jl, renaming SparseVector, and making the edits. But, this seems wasteful.
Of course, you may need to define more methods for your use case. The main difference lies in multiplying an element by the neutral element: a large amount of the linear algebra code will not give correct results. But, in my case, I don't need most of the linear algebra functionality.
A more flexible, but more complicated, design might involve traits. Rather than hard-coding a "neutral" element for a type, add a type parameter to SparseVector that is itself a type. This parameter encodes the neutral element, and perhaps other behavior.
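A rough sketch of what such a trait-parameterized design might look like (all names here are hypothetical illustrations, not proposed API):

```julia
# Hypothetical trait types encoding which element is not stored.
abstract type NeutralElement end
struct ZeroNeutral <: NeutralElement end
struct OneNeutral  <: NeutralElement end

neutral(::Type{ZeroNeutral}, ::Type{T}) where {T} = zero(T)
neutral(::Type{OneNeutral},  ::Type{T}) where {T} = one(T)

# A sparse vector carrying the neutral-element trait as a type parameter.
struct NeutralSparseVector{Tv, Ti<:Integer, N<:NeutralElement} <: AbstractVector{Tv}
    n::Int             # logical length
    nzind::Vector{Ti}  # sorted indices of stored values
    nzval::Vector{Tv}  # stored values
end

Base.size(v::NeutralSparseVector) = (v.n,)

function Base.getindex(v::NeutralSparseVector{Tv,Ti,N}, i::Integer) where {Tv,Ti,N}
    k = searchsortedfirst(v.nzind, i)
    (k <= length(v.nzind) && v.nzind[k] == i) ? v.nzval[k] : neutral(N, Tv)
end
```

Because the trait is a type parameter, dispatch can select different algorithms per neutral element without any runtime branching.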
Are there any use cases for a similar generalization of SparseArray?
I think it is quite fundamental to a lot of the sparse code that this "neutral" element acts like a zero in the sense of neutral(x) + x = x and neutral(x) * x = neutral(x).
But, in my case, I don't need most of the linear algebra stuff.
In this case, couldn't you just use a Dict with all the non-neutral elements?
I think it is quite fundamental to a lot of the sparse code that this "neutral" element acts like a zero
Yes, this is important. This is one reason I delayed opening this issue. The other is that I guess my use case is not very common. On the other hand, there is a significant amount of code in sparsevector.jl that does work independently of the properties of zero.
In this case, couldn't you just use a Dict
It's convenient to use the Array interface. In fact, I have parametric types that can use either a dense or sparse vector-like object as a field. I can immediately use Vector, but I need a modified SparseVector. I suppose I could wrap a Dict in a subtype of AbstractVector. I don't have a clear idea of the sizes of my objects, exactly how they will be used, or what implications this has for the performance of the sparse vector vs. Dict approaches. But a big difference is that there are design and optimization choices in SparseVector that I would have to repeat and debug, etc.
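For comparison, wrapping a Dict in an AbstractVector is short to write. This is a hypothetical sketch (DictVector is not an existing type), and it shows what such a wrapper gives up: no sorted-index storage, so none of the optimized iteration that SparseVector provides.

```julia
# Hypothetical Dict-backed vector with a configurable non-stored value.
struct DictVector{T} <: AbstractVector{T}
    n::Int            # logical length
    data::Dict{Int,T} # explicitly stored entries
    default::T        # value returned for unstored indices
end

DictVector{T}(n::Integer, default::T) where {T} = DictVector{T}(n, Dict{Int,T}(), default)

Base.size(v::DictVector) = (v.n,)
Base.getindex(v::DictVector, i::Integer) = get(v.data, i, v.default)

function Base.setindex!(v::DictVector, x, i::Integer)
    # Storing the default would defeat sparsity, so drop the entry instead.
    x == v.default ? delete!(v.data, i) : (v.data[i] = x)
    return v
end
```

This satisfies the AbstractVector interface, but any structure-exploiting algorithm would still have to be written from scratch.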
More concretely, I have been working with this
struct OpTerm{W<:AbstractOp, T<:AbstractVector{W}, CoeffT}
ops::T
coeff::CoeffT
end
where AbstractOp represents an operator (say linear operator) on a single degree of freedom (or mode) and the field ops is a Vector of such operators. In particular, AbstractOp might be
struct FermiOp <: AbstractFermiOp
ind::Int
end
or
struct Pauli <: AbstractPauli{Complex{Int}}
hi::Bool
lo::Bool
end
For some applications, it turns out that ops is dense. For others, ops is quite sparse, with most entries being the identity operator. (E.g. electronic Hamiltonians may have many modes, but non-identity operators on at most four modes per term.)
I'm not 100% sure that using sparse vectors here is the right approach, and, if it is, how much functionality I need. I put this issue up as an RFC, e.g. to see if others have had similar ideas. It's probably best to begin by copying the sparse vector code to a new repo and working on it there.
Here is a quick implementation that is enough to support my current use case.
https://github.com/jlapeyre/SparseArraysN.jl