Tri Dao

Results 490 comments of Tri Dao

Thanks for this contribution! What happens if a user calls a function that's not currently supported (e.g. paged KV or varlen)?

CUDA minor versions are compatible.

@rocking5566 does the AMD version support alibi?

I personally have no bandwidth for that, so we'd need folks to contribute.

What's the difference? The right comparison is (flashattn in fp16 - reference implementation in fp32) vs (reference implementation in fp16 - reference implementation in fp32).
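A rough sketch of that comparison, using NumPy only so it runs without a GPU. The `attention` helper, the tensor shapes, and the factor-of-2 tolerance are illustrative assumptions, and `candidate_fp16` is a placeholder where the fp16 output of the kernel under test would go.

```python
import numpy as np

def attention(q, k, v, dtype):
    """Plain softmax attention computed entirely in `dtype` (reference implementation)."""
    q, k, v = (x.astype(dtype) for x in (q, k, v))
    scale = dtype(1.0 / np.sqrt(q.shape[-1]))
    scores = (q @ k.T) * scale
    # Subtract the row max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs @ v

def max_abs_error(out, ref_fp32):
    """Largest elementwise deviation from the fp32 reference."""
    return float(np.abs(out.astype(np.float32) - ref_fp32).max())

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 32), dtype=np.float32) for _ in range(3))

ref_fp32 = attention(q, k, v, np.float32)   # ground truth
ref_fp16 = attention(q, k, v, np.float16)   # baseline: same math, lower precision

# Placeholder (assumption): substitute the fp16 output of the kernel under test.
candidate_fp16 = ref_fp16

baseline_err = max_abs_error(ref_fp16, ref_fp32)
candidate_err = max_abs_error(candidate_fp16, ref_fp32)

print(f"reference fp16 vs fp32: {baseline_err:.3e}")
print(f"candidate fp16 vs fp32: {candidate_err:.3e}")

# Illustrative tolerance: the candidate should not be much worse than the baseline.
within_tolerance = candidate_err <= 2 * baseline_err
```

The point is that an fp16 kernel should be judged against the error an fp16 reference already incurs, not against zero.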

No, that's not implemented (one would have to change the backward pass code to compute the gradient of the slopes).