Nvidia
Paul Springer
High-Performance Tensor Transpose Library
springer13
Tensor Contraction C++ Library