HeKa
@mnicely Thank you very much for your answer. May I ask how much improvement your evaluation shows compared to Dao-AILab FlashAttention 2?
@mnicely I recently noticed the speed-up benchmark in the cuDNN release notes. Yes, it looks great. But are there any more details on the QKV shapes and other settings? A single...
@gautam20197 head (d) = 128 with any batch size or sequence length?
> I think you can check your use case using the PyTorch nightlies. `pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121`
>
> And running the PyTorch SDPA example https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html...
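For reference, a minimal sketch of the SDPA call from the linked tutorial; the tensor shapes here (batch=2, heads=8, seq_len=128, head_dim=64) are illustrative assumptions, not values from this thread:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, num_heads, seq_len, head_dim).
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# PyTorch dispatches to a fused backend (e.g. flash attention)
# when the shapes, dtype, and device support it.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```

On CUDA you can check which backend was used via `torch.nn.attention.sdpa_kernel` context managers, as the tutorial shows.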
Python 3.10 is compatible. You could build TFRA yourself or just wait a while.
Has the bug been fixed?
@Mr-Nineteen Could you look into this problem when it is convenient for you?