Jiashi Li

4 comments by Jiashi Li

I've opened a pull request to flash-attention that adds support for a blocked KV cache in flash-decoding, which supports MQA. The performance is nearly identical to the original. You might want...

Hi, any updates on Blackwell?

Great to hear! Will it also support the backward pass?