llm.c
ThunderKittens Backend
Would it make sense to implement the kernels that already exist in the CUDA backend with ThunderKittens as well? ThunderKittens has some nice examples, including FlashAttention-2, and I think a ThunderKittens version of the kernels would also be interesting as an educational resource. Thoughts?