Added more sparse functions like: kron, tril, triu, reshape, adjoint, transpose, sparse-sparse multiplication
I added more functions to support the sparse CUDA arrays.
What I added:
- `kron` function, which returns the Kronecker product of two sparse matrices.
- `tril` and `triu`, which return the lower-triangular and upper-triangular parts of a sparse matrix. It is also possible to pass an offset `k`, as usual.
- Sparse-sparse multiplication, which was missing before and is, I think, the most important feature I added.
- `adjoint` and `transpose` functions.
- `reshape` of sparse matrices.
- `opnorm` now also supports the `p = 1` case.
- `norm` function with all cases (`p = 0, 1, 2, ..., Inf, -Inf`).
- `SparseArrays.droptol!`, which drops all the elements smaller than a given tolerance.
- `exp` function for the matrix exponential, often very useful for quantum mechanics calculations. It does not use the Padé approximation, for several reasons; however, this method seems to be very stable and fast.
- `ishermitian` and `issymmetric` support. Before, these always returned `false` for a generic `CuSparseMatrix`.
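Taken together, a usage sketch of the new methods (illustrative only; I'm assuming CSR storage throughout):

```julia
using CUDA, CUDA.CUSPARSE, LinearAlgebra, SparseArrays

a = sprand(ComplexF32, 100, 100, 0.1)
A = CuSparseMatrixCSR(a)

B = kron(A, A)                  # Kronecker product of two sparse GPU matrices
L = tril(A, -1)                 # strict lower triangle (offset k = -1)
U = triu(A)                     # upper triangle
C = A * A                       # sparse-sparse multiplication
At = transpose(A)
n1 = opnorm(A, 1)               # now supported for p = 1
nInf = norm(A, Inf)
SparseArrays.droptol!(A, 1e-3)  # drop entries below the tolerance
ishermitian(A)                  # no longer unconditionally false
```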
For dense matrices, I also added support for LinearAlgebra's `eigen` function.
I also resolved a problem related to the `copyto!` function. Indeed, the following code used to fail:
```julia
using CUDA
using CUDA.CUSPARSE
using LinearAlgebra
using SparseArrays

a = sprand(ComplexF32, 100, 100, 0.1)
A = CuSparseMatrixCSR(a)
dst = A
src = A + 0.5 * I
copyto!(dst, src)
```
The reason was that `copyto!` copies the `rowPtr` `CuArray`s, which can have different lengths for the `dst` and `src` arrays. Thus, before copying them, I `resize!` all the buffers of the `dst` array.
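The fix can be sketched as follows (a simplified sketch, not the PR's exact code; I'm assuming the usual `CuSparseMatrixCSR` field names and ignoring dimension checks):

```julia
function Base.copyto!(dst::CuSparseMatrixCSR, src::CuSparseMatrixCSR)
    # The buffers of dst and src may have different lengths (e.g. a
    # different number of stored entries), so resize dst's buffers first.
    resize!(dst.rowPtr, length(src.rowPtr))
    resize!(dst.colVal, length(src.colVal))
    resize!(dst.nzVal,  length(src.nzVal))
    copyto!(dst.rowPtr, src.rowPtr)
    copyto!(dst.colVal, src.colVal)
    copyto!(dst.nzVal,  src.nzVal)
    dst.nnz = src.nnz
    return dst
end
```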
Nice, looking good! I left some comments, and I'll try to do a better review later.
Maybe @amontoison could review this too, since he's working on the CUSPARSE code as well.
CI seems happy 🙂 Thanks for the thorough review @amontoison, and @albertomercurio for keeping up with it! I'll defer for the final approval to @amontoison.
Reposting a reply from @albertomercurio (from a resolved comment):
> CUSPARSE has a routine for `droptol!` (https://docs.nvidia.com/cuda/cusparse/index.html#pruneCsr2csr)

> Cool! How should it be implemented? I saw that there is no Julia function.

> You'd need to add a wrapper to e.g. `conversions.jl`.
I just stumbled upon this PR: it looks great! It implements a feature I was talking about in issue #1572, namely support for LinearAlgebra's eigen function. Just a nitpick though: when calling eigen, the result is a structure called Eigen, and you can access the vectors through the vectors field, and the values through the values field. Would it be possible to have a similar structure (for example using a NamedTuple)? For example, eigen for Symmetric real matrices would become:
```julia
function LinearAlgebra.eigen(A::Symmetric{T,<:CuMatrix}) where {T<:BlasReal}
    A2 = copy(A)
    vals, vects = syevd!('V', 'U', A2)
    (vectors = vects, values = vals)
end
```
This would give us a nice API, which would be common for GPU and CPU. We could directly call
```julia
F = eigen(A)
F.vectors
F.values
```
whether A is a CPU array or a GPU array.
Why not re-use the Eigen struct? It's parametric: https://github.com/JuliaLang/julia/blob/87fd6d9d0c804e39fbc221d50dad77ec927fa3db/stdlib/LinearAlgebra/src/eigen.jl#L50-L57 (in 1.8, after https://github.com/JuliaLang/julia/pull/42594)
You're right, that would be even better!
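A sketch of what that could look like (assuming the `CUSOLVER.syevd!` wrapper and that the `Symmetric` wrapper's data lives in `A.data`; untested):

```julia
using LinearAlgebra
using CUDA, CUDA.CUSOLVER

function LinearAlgebra.eigen(A::Symmetric{T,<:CuMatrix}) where {T<:LinearAlgebra.BlasReal}
    A2 = copy(A.data)                         # syevd! overwrites its input
    vals, vects = CUSOLVER.syevd!('V', 'U', A2)
    return Eigen(vals, vects)                 # reuse LinearAlgebra's parametric Eigen
end
```

Returning `Eigen` directly keeps destructuring (`vals, vects = eigen(A)`) and the `F.values` / `F.vectors` accessors identical to the CPU path.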
Hello @GVigne, thanks for this good idea. I have implemented it now.
It should be my last wave of comments!
> CUSPARSE has a routine for `droptol!` (https://docs.nvidia.com/cuda/cusparse/index.html#pruneCsr2csr)

> Cool! How should it be implemented? I saw that there is no Julia function.

> You'd need to add a wrapper to e.g. `conversions.jl`.
Hi @maleadt, I saw that only `cusparseSpruneCsr2csr` and `cusparseDpruneCsr2csr` are defined. These two functions should correspond to real matrices only, correct?
I saw in the cuSPARSE documentation that there is also a `cusparseHpruneCsr2csr` function; is this the one for complex matrices? As far as I know, the `C` and `Z` letters correspond to complex types.
Am I wrong?
cusparseHpruneCsr2csr is for half precision but I don't know why it's not in libcusparse.jl.
> `cusparseHpruneCsr2csr` is for half precision but I don't know why it's not in `libcusparse.jl`.
The headers probably haven't been updated very recently. That also means that this call isn't going to be available for all versions of the CUDA toolkit, of course.
I can update the headers for 11.8; I'll have a look on Monday.
Hi @amontoison,
I'm trying to resolve the conflicts, presumably caused by the merge of PR #1648. I noticed that you declared only types `<:BlasFloat`, and not `BlasComplex`. Why not just use `where {T}`?
Are complex types covered by this change?
https://github.com/JuliaGPU/CUDA.jl/blob/4f9a80696d12ff50622a1ed060e7577baa1f5c82/lib/cusparse/interfaces.jl#L113-L123
`BlasFloat` is `Union{BlasReal, BlasComplex}`. :)
I don't use `where {T}` because other types are not supported by the `geam` routine.
> `BlasFloat` is `Union{BlasReal, BlasComplex}`. :) I don't use `where {T}` because other types are not supported by the `geam` routine.
Ok great!
Regarding the native `droptol!` function, do you think it is better to wait for complex-type support as well? For the moment `cusparseXpruneCsr2csr` only supports real matrices, and I think it is not good to have different methods for different types.
What do you think?
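In the meantime, a type-generic fallback could simply round-trip through the CPU (a hypothetical sketch for illustration, not the device-side implementation in this PR):

```julia
using CUDA.CUSPARSE, SparseArrays

# Hypothetical helper: drop stored entries with |a_ij| <= tol,
# for any element type, by reusing the CPU implementation.
function droptol_fallback(A::CuSparseMatrixCSR, tol::Real)
    h = SparseMatrixCSC(A)          # copy to host
    SparseArrays.droptol!(h, tol)   # CPU droptol! works for all eltypes
    return CuSparseMatrixCSR(h)     # copy back to the device
end
```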
Also, for future reference, it's better to rebase on top of master instead of merging it.
> `BlasFloat` is `Union{BlasReal, BlasComplex}`. :) I don't use `where {T}` because other types are not supported by the `geam` routine.

> Ok great! Regarding the native `droptol!` function, do you think it is better to wait for complex-type support as well? For the moment `cusparseXpruneCsr2csr` only supports real matrices, and I think it is not good to have different methods for different types. What do you think?
Let's use your generic droptol! for now. We will change the dispatch if they support complex types one day.
Also, for future reference, it's better to rebase on top of master instead of merging it.
Ok, do you mean I should commit the changes myself instead of modifying them directly on GitHub? Sorry, this is the first time I've resolved conflicts.
Anyway, I noticed that after this merge there are some problems with the `CuSparseMatrixCSC` types. For example,
```julia
m = 25
n = 35
x = sprand(m, n, 0.2)
d_x = CuSparseMatrixCSC(x)
CUDA.@allowscalar Array(d_x[:]) == x[:] # false
```
works only for CSR types, and fails with CSC or COO matrices.
Also, several other functions like `triu` or `exp` work only for CSR matrices.
I also found that type conversions don't work:
```julia
a = sprand(ComplexF32, 100, 100, 0.02)
A1 = CuSparseMatrixCSC(a)
A2 = CuSparseMatrixCSR(a)
Array(CuSparseMatrixCOO(A1)) ≈ Array(CuSparseMatrixCOO(A2))
```
Do you think the problem is on my side?
EDIT:
I tried the official master branch and, while the first example now works, the last one still does not.
> Also, for future reference, it's better to rebase on top of master instead of merging it.

> Ok, do you mean I should commit the changes myself instead of modifying them directly on GitHub? Sorry, this is the first time I've resolved conflicts.
I meant that instead of `git merge master` you do `git rebase master`. It doesn't matter here though, we'll squash this PR anyway.
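Concretely, updating the PR branch would look something like this (assuming `origin` points at the upstream repository):

```
git fetch origin
git rebase origin/master      # replay the PR commits on top of master
# fix any conflicts, `git add` them, then `git rebase --continue`
git push --force-with-lease   # a rebase rewrites history, so force-push
```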
```julia
m = 25
n = 35
x = sprand(m, n, 0.2)
d_x = CuSparseMatrixCSC(x)
CUDA.@allowscalar Array(d_x[:]) == x[:] # false
```
I tracked down the problem by starting from the master branch and adding my lines one by one. The problem with this code comes from my `reshape` definition
```julia
Base.reshape(A::$SparseMatrixType{T,M}, dims::NTuple{N,Int}) where {T,N,M} =
    $SparseMatrixType( reshape(CuSparseMatrixCOO(A), dims) )
```
And I think it is related to the fact that the conversion to a COO matrix also fails on the master branch.
However, I don't know why `Array(d_x[:])` should call my `reshape` function.
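One way to investigate (a generic debugging sketch, not code from this PR) is to ask Julia which methods are being dispatched to:

```julia
@which d_x[:]    # shows which getindex method handles the colon
@which vec(d_x)  # in Base, vec(a) is defined as reshape(a, length(a)),
                 # which would explain how reshape ends up being called
```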
cusparseHpruneCsr2csr will be added by https://github.com/JuliaGPU/CUDA.jl/pull/1649
Ok, but I think the problem is the conversion from/to COO matrices.
My functions often call this conversion, and this is why they fail.
Do you also see this problem in your version, @amontoison and @maleadt?
Sorry, my comment was unrelated to the issue you're having; it was just in response to what you said earlier.
> Ok, but I think the problem is the conversion from/to COO matrices. My functions often call this conversion, and this is why they fail. Do you also see this problem in your version @amontoison and @maleadt?
Yes @albertomercurio, I opened a PR to fix the issue (#1655). The conversions between sparse matrices were never tested :grimacing:
Ok, I should be done now.
CI failure is related.
Still a couple of relevant issues. Are you not seeing those locally?
I was convinced the changes were good, so I had not tested them on my local setup.
Now it should be ok.
LGTM! Let's finally merge this 🙂 Thanks for the PR!
For future PRs, maybe keep the scope a little more confined so that it's easier to review & merge (this one introduces quite a few features).