Tri Dao
it's supported
I don't know anything about HF transformers.
Yes, you can just download the wheel compiled with CUDA 12.3. It should be compatible.
Thanks! Is the formatting by black using a line length of [100](https://github.com/Dao-AILab/flash-attention/blob/main/flash_attn/pyproject.toml)?
Sorry, I've just been busy. Let me take a look tomorrow.
It could be any other code that's hanging.
Seems like a Triton error. You might have better luck searching their repo issues.
The 4090 is Ada, not Hopper.
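For reference, GPU architectures map to CUDA compute capabilities (SM versions). A minimal, illustrative (not exhaustive) sketch of that mapping, with hypothetical names `ARCH_BY_SM` and `arch_name`:

```python
# Illustrative mapping from CUDA compute capability to architecture name.
# The RTX 4090 is sm_89 (Ada Lovelace); Hopper is sm_90 (e.g. H100).
ARCH_BY_SM = {
    (8, 0): "Ampere",  # A100
    (8, 6): "Ampere",  # RTX 30xx
    (8, 9): "Ada",     # RTX 4090
    (9, 0): "Hopper",  # H100
}

def arch_name(sm: tuple) -> str:
    """Return the architecture name for a (major, minor) compute capability."""
    return ARCH_BY_SM.get(sm, "unknown")
```

On a live system you could obtain the `(major, minor)` pair from `torch.cuda.get_device_capability()`.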
For those with AMD devices: can you help test this PR?
Softcapping is not supported yet in the backward pass.
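For context, softcapping bounds attention scores with a scaled tanh before the softmax. A minimal sketch of the forward-pass transform, with `cap=30.0` as a hypothetical value:

```python
import math

def softcap(score: float, cap: float = 30.0) -> float:
    """Squash a raw attention score into (-cap, cap) via tanh.

    cap=30.0 is an illustrative value, not a library default.
    """
    return cap * math.tanh(score / cap)
```

Supporting this in the backward pass requires an extra gradient term (the derivative of tanh, `1 - tanh(x)**2`), which is why it lands in the forward kernel first.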