flash-attention
How do I start learning to manipulate tensors at a low level, like flash-attention does?
I'm keen to manipulate tensors at the low level in C++ and CUDA, and I can ask ChatGPT to translate or explain C++ line by line. I just don't know where to start my learning journey.
The Triton tutorials are a good place to start: they show how tensors are laid out in memory and how to read from and write to them. After that, you can look at CUTLASS.
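To give a rough idea of what "laid out in memory" means before opening the tutorials, here is a minimal pure-Python sketch of row-major strides and flat offsets. It is not taken from Triton or CUTLASS; the helper names `row_major_strides` and `flat_offset` are made up for illustration. The same offset arithmetic is what you end up writing by hand when computing pointers in a kernel.

```python
# Hypothetical helpers for illustration only; not part of any library.

def row_major_strides(shape):
    """Element strides of a contiguous row-major tensor with this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)   # the last dimension moves 1 element at a time
        step *= dim            # each outer dimension skips a whole inner block
    return tuple(reversed(strides))


def flat_offset(index, strides):
    """Position of a multi-dimensional index in the flat buffer."""
    return sum(i * s for i, s in zip(index, strides))


# A 2x3x4 tensor stored as one flat buffer of 24 elements.
shape = (2, 3, 4)
strides = row_major_strides(shape)
buffer = list(range(2 * 3 * 4))

# Read: element [1, 2, 3] lives at 1*12 + 2*4 + 3*1 in the buffer.
value = buffer[flat_offset((1, 2, 3), strides)]

# Write: store through the same offset computation.
buffer[flat_offset((0, 1, 2), strides)] = -1

# A transpose is just a stride swap over the same buffer, with no copy.
strides_2d = row_major_strides((2, 3))          # original 2x3 matrix
strides_2d_t = (strides_2d[1], strides_2d[0])   # view as its 3x2 transpose
same_element = (
    flat_offset((1, 2), strides_2d) == flat_offset((2, 1), strides_2d_t)
)
```

Once this indexing feels natural, the pointer arithmetic in the Triton tutorials (base pointer plus offsets times strides) should read much more easily.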