Jiashi Li
Results: 4 comments of Jiashi Li
I've made a pull request to flash-attention that enables support for blocked KV cache in flash-decoding which supports MQA. The performance is nearly identical to the original. You might want...
Is it 2e-4 or 2e^-4?
Hi, any updates on Blackwell?
Great to hear! Will it also support the backward pass?