SparseArrays.jl add the iteratenz API

I couldn't find anything like this so i thought i'd make a small implementation. the idea is that this makes iteration on sparsearrays much easier, no need to deal with colptr & friends. It's still pretty fast. But this begs the question, has this been in the api and did i miss it?

This should make adding special matrice operations easy.

julia> using BenchmarkTools, SparseArrays, LinearAlgebra

julia> A = I + sprandn(10_000, 10_000, 0.001);

julia> function f(A)
           c = zero(eltype(A))
           for (i, j, v) in SparseArrays.iternz(A)
               if (i + j) % 2 == 0
                   c += v 
               end
           end
           return c
       end
f (generic function with 1 method)

julia> function g(A)
           c = zero(eltype(A))
           for (i, j, v) in zip(findnz(A)...)
               if (i + j) % 2 == 0
                   c += v 
               end
           end
           return c
       end
g (generic function with 1 method)

julia> function h(A)
           c = zero(eltype(A))
           @inbounds for j in axes(A, 2),
               ind in A.colptr[j]: A.colptr[j + 1] - 1
               i = A.rowval[ind]
               if (i + j) % 2 == 0
                   c += A.nzval[ind]
               end
           end
       
           c
       end
h (generic function with 1 method)

julia> function k(A)
           c = zero(eltype(A))
           cptr = A.colptr
           rv = A.rowval
           nzv = A.nzval
           @inbounds for j in axes(A, 2),
               ind in cptr[j]: cptr[j + 1] - 1
               i = rv[ind]
               if (i + j) % 2 == 0
                   c += nzv[ind]
               end
           end
       
           c
       end 
k (generic function with 1 method)

julia> function l(A)
           c = zero(eltype(A))
           @inbounds for j in axes(A, 2),
               ind in SparseArrays.getcolptr(A)[j]:SparseArrays.getcolptr(A)[j + 1] - 1
               i = SparseArrays.getrowval(A)[ind]
               if (i + j) % 2 == 0
                   c += SparseArrays.getnzval(A)[ind]
               end
           end
           c
       end 
l (generic function with 1 method)

julia> @assert f(A) == g(A) == h(A) == k(A) == l(A)

julia> @benchmark f($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  149.655 μs … 272.489 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     156.244 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   162.705 μs ±  18.228 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▃▁▁▃▄▂▂▂▂▇▃▂▁▁▁▃▁▁▁▁▁                           ▁▄          ▂
  █████████████████████████▇▇▇██▆▅▆▆▆▆▆▅▆▆▆▅▆▅▅▄▅▅▆▆██▇▄▅▆▅▆█▆▆ █
  150 μs        Histogram: log(frequency) by time        227 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark g($A)
BenchmarkTools.Trial: 9411 samples with 1 evaluation.
 Range (min … max):  301.049 μs …   4.224 ms  ┊ GC (min … max): 0.00% … 55.37%
 Time  (median):     393.407 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   527.771 μs ± 346.183 μs  ┊ GC (mean ± σ):  8.61% ± 12.64%

  █▆▅▅▅▄▄▄▄▃▂▃▂▂▂▂▂▁▁▁▁▁▁▁      ▁▁   ▁▂                         ▂
  ██████████████████████████▇██████▇█████████▇▆▇▆▅▆▅▆▆▆▅▅▄▄▄▅▅▄ █
  301 μs        Histogram: log(frequency) by time        1.8 ms <

 Memory estimate: 2.52 MiB, allocs estimate: 6.

julia> @benchmark h($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  370.690 μs … 559.212 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     389.377 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   397.357 μs ±  21.079 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▃▇█▆▅▄▃▁                                                  
  ▁▁▂▃▇████████▇▅▄▃▃▃▄▄▅▄▄▄▄▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▂▂▂▁▁▁▁▁▁▁▁▁ ▃
  371 μs           Histogram: frequency by time          472 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark k($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  380.104 μs … 556.205 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     394.372 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   401.398 μs ±  17.597 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▃▆█▇▃                                                    
  ▁▁▂▃▄▇█████▇▃▃▂▂▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  380 μs           Histogram: frequency by time          470 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark l($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  377.320 μs … 619.899 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     388.873 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   395.194 μs ±  16.974 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▄█▇▄▁                                                     
  ▁▁▂▅██████▆▄▃▃▂▂▂▂▃▂▂▂▃▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  377 μs           Histogram: frequency by time          463 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark f($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  145.030 μs … 290.142 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     156.375 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   163.418 μs ±  18.701 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▇█▆▃▁▄▄▃▃▃▂▇▃▃▂▁▃▃▂▂▁▂▁▁▂▁▁▁   ▁                 ▄▃        ▂
  ▃▃▁█████████████████████████████████████▇█▇█▇▇█▇▆▇▆▇██▇▆▆▇▇██ █
  145 μs        Histogram: log(frequency) by time        225 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

tho my problem that it makes no sense that it's faster than manual iteration :/

Jun 26 '22 00:06 SobhanMP

Codecov Report

Attention: Patch coverage is 96.29630% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.14%. Comparing base (f51b64c) to head (2e16fcc). Report is 199 commits behind head on main.

Files	Patch %	Lines
src/sparsematrix.jl	94.73%	1 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #167      +/-   ##
==========================================
+ Coverage   92.11%   92.14%   +0.02%     
==========================================
  Files          11       11              
  Lines        7203     7230      +27     
==========================================
+ Hits         6635     6662      +27     
  Misses        568      568

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Jun 26 '22 00:06 codecov-commenter

This is great, I have a prototype version based on ArrayIteration.jl, but having a version with no dependencies is great and very valuable.

Jun 26 '22 01:06 rayegun

@ViralBShah what do you think about implementing the same for all special matrices (symmetric, transpose, diagonal, etc)? this could give rise to fast generic implementations for many of the functions like dot. if you think it's a good idea, where should i submit a pull request?

Jun 28 '22 03:06 SobhanMP

That should almost certainly be experimented with in a separate package first.

Jun 28 '22 03:06 rayegun

Agree. I would put that in a separate package.

Jun 28 '22 12:06 ViralBShah

i made https://github.com/SobhanMP/SparseExtra.jl for now, guess i'll add a bunch of other stuff while i'm at it...

Jun 29 '22 22:06 SobhanMP

I would love to understand why this is faster? Will we use this iterator across other parts of SparseArrays?

Maybe we revisit once your SparseExtra.jl package has more stuff in it and we can then decide if it makes sense to bring it in here, or register separately.

Jul 12 '22 21:07 ViralBShah

note that it's not faster than the last function (the pattern used almost everywhere in Sparsearrays). accessing via the getcolptr is faster than naming a variable (i don't know why). it could be used to simplify a lot of code. for instance

const S_or_SS{Tv, Ti} = Union{
    AbstractSparseMatrixCSC{Tv, Ti},
    [i{AbstractSparseMatrixCSC{Tv, Ti}} 
        for i in [Symmetric, LowerTriangular, UpperTriangular, Transpose, Adjoint, Hermitian]]...}
function dot(x::AbstractVector, 
    A::S_or_SS{Tv},
    y) where Tv  
    acc = zero(promote_type(eltype(x), eltype(A), eltype(y)))
    for (v, i, j) in iternz(A)
        acc += x[i] * v * y[j]
    end
end

would be a correct implementation for 1 indexed x and y and all sort of special matrix. It could also be used for the dense version but the dense twice as slow.

admittedly, my implementation is missing a lot of the functions for this to work right now, but that's the idea.

Jul 12 '22 21:07 SobhanMP

@ViralBShah i've updated the readme of my repo, feel free to check it out

Jul 17 '22 04:07 SobhanMP

@ViralBShah should this stay in SparseExtra.jl (and close this) or should i make this a pull request (in contrast to a draft)?

this would make closing https://github.com/JuliaSparse/SparseArrays.jl/issues/23, https://github.com/JuliaSparse/SparseArrays.jl/issues/83, and possibly other dot implementation possible. Think transpose of diagonal matrix of sparse vector. It would also simplify every function that has a fast loop over sparse arrays.

would maybe close https://github.com/JuliaSparse/SparseArrays.jl/issues/50 ref https://sobhanmp.github.io/SparseExtra.jl/iternz/

Aug 10 '22 06:08 SobhanMP

If the design is ready, let's bring it in. I will look at the SparseExtra repo soon.

Aug 10 '22 11:08 ViralBShah

@Wimmerer Can you review too? If good enough to bring here, we can do finer reviews in a PR (if necessary).

Aug 10 '22 11:08 ViralBShah

I am persuaded enough by the real nice documentation. Let's open the PR. 1.9 feature freeze may be soon, so let's get it in. Let's also get the design doc in. The real moment of truth will be when we start using it throughout the library.

@fredrikekre @KristofferC @dkarrasch Any comments on the API in https://sobhanmp.github.io/SparseExtra.jl/iternz/, which we are discussing bringing into SparseArrays.jl?

Aug 10 '22 14:08 ViralBShah

I like it. A lite version of ArrayIteration.jl, should simplify the code base a lot.

Thoughts on exporting? I lean towards no for now.

Aug 10 '22 15:08 rayegun

in term of API it's done (at least the outward facing part) but there are three things worth considering.

skip_col/skip_row_to skip to the next element - 1 so that calling next on their returned state gives the desired element.
how about some infrastructure for matrices that are not column major
does it need to support arrays that don't start at one? (for algebraic ones)

Aug 10 '22 16:08 SobhanMP

We can keep it internal for a whole release cycle and export it later. My thought is to get something in that is small and addresses the current data structure, use it widely in the package, and then expand it to be more general.

Aug 10 '22 17:08 ViralBShah

@fredrikekre @KristofferC @dkarrasch Any comments on the API in https://sobhanmp.github.io/SparseExtra.jl/iternz/, which we are discussing bringing into SparseArrays.jl?

Aug 30 '22 16:08 SobhanMP

@ViralBShah @dkarrasch @fredrikekre @KristofferC ping

Sep 19 '22 03:09 SobhanMP

In the link, the API just shows, which feels nice enough to bring in.

all(iternz(x)) do (v, k...)
    x[k...] == v
end

I hadn't seen the rest of the API here https://sobhanmp.github.io/SparseExtra.jl/. I'm not sure if it makes sense to bring all of it into SparseArrays.jl. What's the thinking behind bringing all of it into SparseArrays.jl?

Sep 20 '22 17:09 ViralBShah

I hadn't seen the rest of the API here

i don't think it makes sense to bring that in SparseArrays. the parallel \ is nice but i'm not sure there is any merit in bringing it here.

Sep 20 '22 19:09 SobhanMP

So I finally took a look at SparseExtra.jl. It's very nice that it composes with outer wrappers! What seems to be currently missing is that for Symmetric and Hermitian, the elements of the viewed triangle are transposed and adjoint, respectively. That doesn't show up in cases with real eltypes, but it will show up, for instance, in 3-arg dot with a (complex) Hermitian matrix. Moreover, when you read diagonal elements, you need to take real part in the Hermitian case, and, more generally, call symmetric or hermitian on the underlying element; see

https://github.com/JuliaLang/julia/blob/690a5f67c13fd23c7b48e60c31bfa565c0eee861/stdlib/LinearAlgebra/src/symmetric.jl#L236-L255

Unfortunately, I wasn't able to follow all the details to propose how to fix this. I believe that these are the only two wrappers that have non-trivial manipulations, so when we get those right, this could be ported here and measured for performance.

Also, x-ref https://github.com/JuliaLang/julia/pull/40103.

Jun 19 '23 19:06 dkarrasch

@dkarrasch The links were very helpful. I think that this works as intended now.

julia> A = Symmetric([randn(20, 20) for _ in 1:20, _ in 1:20]);
julia> sum(sum(A[i, j] - v for (v, i, j) in iternz(A))) == 0
true
julia> A = Hermitian([randn(20, 20) + randn(20, 20) * 1im for _ in 1:20, _ in 1:20]);
julia> sum(sum(A[i, j] - v for (v, i, j) in iternz(A))) == 0
true

Are there still interesting in this? Should I update the pull request? Do we want more tests?

Mar 03 '24 05:03 SobhanMP

I thought the main issue here was the significant performance gap. I would be supportive of bringing the iterator in - it is a nice API, but perhaps we can't migrate the internals to it until it is on par or faster. However, it can still be made available to users (with an appropriate note).

@rayegun @dkarrasch @jishnub @fredrikekre Thoughts?

Mar 03 '24 14:03 ViralBShah

I thought the main issue here was the significant performance gap.

I don't think that this is an issue look at the plots in https://sobhanmp.github.io/SparseExtra.jl/iternz/

The loss of performance is fairly small. IMHO, even if we do not replace existing code, we can at least make fast fallbacks for all methods with this.

Mar 07 '24 03:03 SobhanMP

I am onboard with introducing it but would be good at least to hear from @rayegun

Mar 07 '24 14:03 ViralBShah

@rayegun (ping)

May 21 '24 02:05 SobhanMP

SparseArrays.jl SparseArrays.jl copied to clipboard

add the iteratenz API

Codecov Report

SparseArrays.jl
SparseArrays.jl copied to clipboard