xFasterTransformer
[Kernel] Less compute for Self-Attention (Q * K)