Ying Zhang
@hwu36 It's fine, I don't need it any more, thx.
Thanks @tenpercent, looks good to me overall. Could you share benchmark results? How about compilation time?
@pytorchbot merge
Hi @danthe3rd, this problem should be fixed by the new bwd kernel commit. I tried your script with `assert not dq.isnan().any().item()` and didn't observe errors. Could you check...
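For reference, a minimal sketch of the kind of NaN check described above, in case it helps reproduce. This assumes the `flash_attn` package's `flash_attn_func` interface; the shapes are illustrative and not taken from the original repro script:

```python
import torch
from flash_attn import flash_attn_func  # assumes flash_attn is installed

# Illustrative shapes, not the original repro.
batch, seqlen, nheads, headdim = 2, 128, 4, 64
q = torch.randn(batch, seqlen, nheads, headdim,
                device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q, requires_grad=True)
v = torch.randn_like(q, requires_grad=True)

out = flash_attn_func(q, k, v, causal=True)
out.sum().backward()

# The check from the comment: the bwd kernel should not produce NaNs in dq.
dq = q.grad
assert not dq.isnan().any().item()
```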
Sorry, I deleted the original branch by accident. This is the new PR: https://github.com/Dao-AILab/flash-attention/pull/1233.