ITensors.jl
[NDTensors] cuTENSOR extension
Description
The goal of this PR is to create a cuTENSOR extension for the NDTensors library. In this extension, I will write a function that converts (Dense) Tensors from NDTensors into cuTENSOR tensors, and I will create an overload of the NDTensors contract function that calls the cuTENSOR backend contraction. As a reach goal, in a first effort for the BlockSparse code I will convert the block-sparse tensors into dense tensors, call contract, construct the output block-sparse tensor, and transfer only the non-zero blocks back into it. This functionality will later be solved more robustly using efforts from the NVIDIA team.
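The reach-goal path described above could be sketched roughly as follows. This is a sketch only: `dense`, `contract`, and `nzblocks` are NDTensors-style functions, while `make_output_blocksparse` and `blockview` are hypothetical helper names used purely for illustration.

```julia
# Rough sketch of the reach-goal fallback described above (not the final API).
# `dense` and `contract` are NDTensors functions; `make_output_blocksparse`
# and `blockview` are hypothetical helpers for illustration.
function contract_blocksparse_via_dense(tensor1, labels1, tensor2, labels2)
  # Densify both block-sparse tensors (structural zeros become explicit).
  dense_result = contract(dense(tensor1), labels1, dense(tensor2), labels2)
  # Allocate the block-sparse output with the correct nonzero block structure.
  result = make_output_blocksparse(tensor1, labels1, tensor2, labels2)
  # Copy back only the nonzero blocks from the dense result.
  for block in nzblocks(result)
    copyto!(result[block], blockview(dense_result, block))
  end
  return result
end
```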
Checklist:
- [x] It is possible to convert an NDTensors.Tensor into a cuTENSOR tensor
- [x] It is possible to call cuTENSOR-based contraction code
- [x] The result from the cuTENSOR contract is equivalent to the NDTensors contract
- [x] Create unit tests for the cuTENSOR extension.
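A minimal check of the boxes above might look like the following. This is a sketch: `randomTensor` and `array` are NDTensors functions, and `CuTensor` with its `*` contraction overload comes from cuTENSOR.jl; the exact names and label conventions may differ from the final tests.

```julia
using NDTensors, CUDA, cuTENSOR
using cuTENSOR: CuTensor

# CPU reference: contract over the shared (negative) label.
A = randomTensor(Float64, (10, 20))
B = randomTensor(Float64, (20, 30))
C_cpu = NDTensors.contract(A, (1, -1), B, (-1, 2))

# Same contraction through cuTENSOR's high-level interface.
cA = CuTensor(CuArray(array(A)), ['i', 'j'])
cB = CuTensor(CuArray(array(B)), ['j', 'k'])
cC = cA * cB  # runs on the cuTENSOR backend

# The GPU result should match the CPU contraction.
@assert Array(cC.data) ≈ array(C_cpu)
```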
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 60.28%. Comparing base (ceb26a7) to head (dfc4e7e). Report is 15 commits behind head on main.
:exclamation: Current head dfc4e7e differs from pull request most recent head 3636813. Consider uploading reports for the commit 3636813 to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## main #1395 +/- ##
===========================================
+ Coverage 49.23% 60.28% +11.05%
===========================================
Files 110 148 +38
Lines 8320 9757 +1437
===========================================
+ Hits 4096 5882 +1786
+ Misses 4224 3875 -349
:umbrella: View full report in Codecov by Sentry.
Looks like a good start, nice to see it is pretty simple.
@mtfishman yeah, it's surprisingly simple! I have started adding cuTENSOR to the NDTensors test suite. So far I have found that the code works for Dense and BlockSparse Tensors and ITensors. However, it fails for Diag because there is no array function defined for that storage type. Will work through debugging.
There are a few errors in the ITensor/MPS testing because of unsupported mixed-type contractions in cuTENSOR. I have opened a bug report in CUDA.jl here.
Thanks, seems like we can promote the tensors to a common type ourselves to circumvent that.
@kmp5VT I think we should only send tensors with Dense storage wrapping CuArray data to the cuTENSOR backend, for example:
function NDTensors.contract(
  Etensor1::Exposed{<:CuArray,<:DenseTensor},
  labelstensor1,
  Etensor2::Exposed{<:CuArray,<:DenseTensor},
  labelstensor2,
  labelsoutput_tensor,
)
  # ...
end
For some more context, here is where the dense blocks are contracted when two tensors with BlockSparse storage are contracted: https://github.com/ITensor/ITensors.jl/blob/v0.4.0/NDTensors/src/blocksparse/contract_generic.jl#L141-L150. R[blockR], tensor1[blocktensor1], and tensor2[blocktensor2] are blocks of the block sparse tensor, which are tensors with Dense storage.
So ideally when block sparse contraction occurs, if the cuTENSOR backend is enabled those dense block contractions will use the dense contraction code defined in this new cuTENSOR extension. For that to happen, I think we should overload this contract signature:
function NDTensors.contract!(
  exposed_tensor_dest::Exposed{<:CuArray,<:DenseTensor},
  tensor_dest_labels,
  exposed_tensor1::Exposed{<:CuArray,<:DenseTensor},
  tensor1_labels,
  exposed_tensor2::Exposed{<:CuArray,<:DenseTensor},
  tensor2_labels,
  α::Number,
  β::Number,
)
  # Forward contraction to `cuTENSOR`
end
in the package extension, as opposed to the out-of-place NDTensors.contract function.
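One possible shape for that body, sketched with cuTENSOR.jl's high-level `CuTensor` wrapper and its `mul!` overload. This is illustrative only: the real extension dispatches on `Exposed` wrappers, `forward_to_cutensor!` is a hypothetical helper name, and converting NDTensors contraction labels to cuTENSOR mode labels is glossed over here.

```julia
using LinearAlgebra: mul!
using cuTENSOR: CuTensor

# Hypothetical helper illustrating how the in-place contraction could be
# forwarded to cuTENSOR: dest = α * tensor1 * tensor2 + β * dest.
function forward_to_cutensor!(
  tensor_dest, labels_dest, tensor1, labels1, tensor2, labels2, α, β
)
  cdest = CuTensor(array(tensor_dest), collect(labels_dest))
  c1 = CuTensor(array(tensor1), collect(labels1))
  c2 = CuTensor(array(tensor2), collect(labels2))
  mul!(cdest, c1, c2, α, β)  # cuTENSOR contraction kernel
  return tensor_dest
end
```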
An issue with adding cuTENSOR to ITensors that I just realized again is that cuTENSOR has a compat restriction on TensorOperations to version 0.7.1, while TensorOperations is currently on version 4.1.1. This compat restriction causes some of our tests to fail in the TensorAlgebra module. We could do what we are doing with Metal and AMDGPU: Pkg.add("cuTENSOR") when we specifically test that package, and have a flag in the TensorAlgebra module to disable the tests when "cuTENSOR" in ARGS.
You say "cuTENSOR has a compat restriction on TensorOperations", but cuTENSOR is a lower level library that likely doesn't know anything about TensorOperations; maybe you mean the other way around?
In the TensorOperations.jl Project.toml: https://github.com/Jutho/TensorOperations.jl/blob/master/Project.toml
I see they have a cuTENSOR extension and the [compat] entry is set to cuTENSOR = "1", while the latest cuTENSOR.jl version is v2.1.0 (https://github.com/JuliaGPU/CUDA.jl/blob/master/lib/cutensor/Project.toml). Maybe that is the issue you are seeing?
I see there is an open PR about upgrading TensorOperations to cuTENSOR v2 here: https://github.com/Jutho/TensorOperations.jl/pull/160.
Also note that cuTENSOR.jl v2 only supports Julia 1.8 and onward (https://github.com/JuliaGPU/CUDA.jl/blob/v5.3.3/lib/cutensor/Project.toml#L20).
I guess the latest CUDA.jl version now requires Julia 1.8 and up anyway (https://github.com/JuliaGPU/CUDA.jl/blob/v5.3.3/Project.toml#L80), so we could have the same restriction for NDTensorsCUDAExt and NDTensorscuTENSORExt.
I think the best course of action would be something like what you said: manually adding and removing packages as needed in the tests, instead of putting them as dependencies in the test Project.toml.
What we could do is surround any code that relies on TensorOperations in the tests with Pkg.add("TensorOperations") ... Pkg.rm("TensorOperations"), and then surround any code that relies on cuTENSOR with Pkg.add("cuTENSOR") ... Pkg.rm("cuTENSOR"). Then we should be able to use the latest versions of TensorOperations and cuTENSOR in the appropriate parts of the tests. That's similar to your suggestion, but would allow us to test TensorAlgebra even if cuTENSOR tests are requested. Hopefully that wouldn't be too complicated to set up.
@mtfishman Sorry, I did misunderstand what was going on. Thank you for looking into that and sending me this information!
If we are supporting CUDA and cuTENSOR only in Julia version 1.8 and up, does that mean we should bump the compat of NDTensors and ITensors to julia = 1.8?
No, I think that would be pretty extreme. I think the only way to do it would be to require a more recent version of CUDA that requires Julia 1.8, which would then implicitly only allow users to use the latest NDTensorsCUDAExt with Julia 1.8. I don't think that is necessary, however, since NDTensorsCUDAExt appears to work just fine in older versions of Julia; I guess those tests are automatically using an older version of CUDA.jl, which we are compatible with anyway (since we only use pretty high-level features of CUDA.jl). So for NDTensorsCUDAExt I don't think we need to do anything right now.
For cuTENSOR/NDTensorscuTENSORExt, I assume you will have to write the package extension with a certain cuTENSOR.jl version in mind, i.e. write it for cuTENSOR v2, in which case we should put a compat entry for cuTENSOR of 2 in the NDTensors Project.toml. That will implicitly only allow users to use NDTensorscuTENSORExt with Julia 1.8 and above.
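In the NDTensors Project.toml, that would look something like the following (a sketch; the corresponding [weakdeps] entry with cuTENSOR's UUID is omitted here):

```toml
[extensions]
NDTensorscuTENSORExt = "cuTENSOR"

[compat]
cuTENSOR = "2"
```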
Right now I have a rudimentary Pkg.add and Pkg.rm in the TensorAlgebra test file, but it throws some extension compile errors which might be an issue, so there might be a better way. I haven't done much yet to enforce the Julia 1.8 version other than add a compat value to the NDTensors project. I have updated the code to launch cuTENSOR from the contract! kernel and have updated the kernels across the library to be able to use cuTENSOR. Right now for BlockSparse, dense blockwise contractions can call the cuTENSOR-based contract; however, in the DMRG testing I am seeing an internal cuTENSOR error:
ERROR: CUTENSORError: an invalid value was used as an argument (code 7, CUTENSOR_STATUS_INVALID_VALUE)
that I am working to remove. Dense contractions are working properly and using cuTENSOR.
Sounds good, seems like there are a few wrinkles to work out but mostly coming along.
Sorry if I repeat things you already know, but let me just give you the state of affairs for that PR here: we have a package extension that works for cuTENSOR v1 in the current version of TensorOperations, but cuTENSOR v2 is actually very breaking, as it renames lots of the functions. I started doing some work on getting the update going, but it is a bit more cumbersome as TensorOperations handles generic StridedView objects from Strided.jl. In principle this is now finished, but it requires a large amount of code duplication from cuTENSOR itself, which specializes its methods on DenseCuArray, while in principle this restriction can be loosened. I got in touch with @maleadt to maybe reorganize a bit of cuTENSOR's functions (see https://github.com/JuliaGPU/CUDA.jl/pull/2356), and once that is finalized TensorOperations should be updated soon after.
I have also struggled a bit with the restriction of cuTENSOR v2 with julia v1.8, and I think we more or less decided to also drop the support for julia v1.6-1.7 in the new TensorOperations versions, although this is not set in stone yet.
Thanks for the context @lkdvos. It makes sense to start by improving the cuTENSOR wrapper code before going through a big refactor of TensorOperations. This isn't causing serious issues for us; only a few tests of ours rely on TensorOperations.
Thanks @kmp5VT, this is a great step to get to for our GPU backends!
Can you update the table entry in https://github.com/ITensor/ITensors.jl/blob/v0.5.0/docs/src/RunningOnGPUs.md#gpu-backends?
@mtfishman I added a block sparse (cuTENSOR) row in the table and put In progress there, since I am still working on that view issue that I found before, and will open an issue with cuTENSOR when I can isolate the problem.