LoopVectorization.jl icon indicating copy to clipboard operation
LoopVectorization.jl copied to clipboard

@turbo can't find index when looping over sparse arrays

Open jd-lara opened this issue 2 years ago • 3 comments

I am trying to use turbo in this function to multiply A'x when A is sparse

function my_mul_lv!(y::Vector{Float64}, A::SparseMatrixCSC{Float64,Int64}, x::Vector{Float64})
    @turbo for i in eachindex(y) 
        tmp = 0.0
        for j in nzrange(A, i) 
            tmp += A.nzval[j] * x[A.rowval[j]]
        end
        y[i] = tmp
    end
    return
end

The function works without the @turbo macro but when I add it, it throws this an error that i isn't defined: UndefVarError: i not defined

jd-lara avatar Apr 06 '23 15:04 jd-lara

To provide further details if I define

function my_mul_lv!(y::Vector{Float64}, A::SparseMatrixCSC{Float64,Int64}, x::Vector{Float64})
    @turbo for i in eachindex(y) # for each branch
        tmp = 0.0
        nzrange = nzrange(A, i) # non zero bus indices
        for j in nzrange
            tmp += A.nzval[j] * x[A.rowval[j]]
        end
        y[i] = tmp
    end
    return
end

I get the error

ERROR: LoadError: ArgumentError:   `@turbo` only supports loops iterating over ranges, and not objects of type `typeof(nzrange)`.

However,

julia> typeof(nzrange(A, 1))
UnitRange{Int64}

jd-lara avatar Apr 06 '23 15:04 jd-lara

It looks like sparse arrays are currently not supported. This is not surprising, as they lead to irregular memory access.

However, once the random access loads have been performed, the computation (in this case, a multiplication) can still be performed using SIMD instructions.

It would be great to have a special macro for loops over columns of a sparse matrix , that can parallelize those parts that can be. And also send each column to a different thread, if using @tturbo.

ojwoodford avatar Sep 20 '23 09:09 ojwoodford

This is something I'd like to work on eventually in LoopModels, but it'd be a long time. LoopModels is currently very coupled with affine representations of loop nests and indices.

You may be interested in https://github.com/willow-ahrens/Finch.jl

chriselrod avatar Sep 20 '23 16:09 chriselrod