Driss Guessous

Results: 183 comments of Driss Guessous

@eqy Thanks for staying on top of this. I am going to close this issue since the stride PR has landed and we root-caused the other slowdown.

```
❯ python rc.py
---------------------------------------------SDPA-Flash---------------------------------------------
ALL GOOD
---------------------------------------------SDPA-CuDNN---------------------------------------------
ALL GOOD
❯ pip freeze | grep torch
-e git+https://github.com/pytorch-labs/attention-gym@2e4d04aa1c500879400ba2547e106f135fd5a4c1#egg=attn_gym
pytorch-triton==3.1.0
torch==2.6.0+cu124
# Editable install with no version control (torchao==0.6.1)
~/meta/scripts/sdpa...
```
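The `rc.py` script itself isn't included in the comment; as a rough illustration only, a minimal sketch of this kind of backend sanity check, assuming a plain forward pass under each SDPA backend (shapes, tolerances, and the print format here are made up, not taken from the original script):

```python
# Sketch only: rc.py is not shown above. This runs scaled_dot_product_attention
# under the flash and cuDNN backends and compares against the math backend.
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

q, k, v = (torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16) for _ in range(3))

with sdpa_kernel(SDPBackend.MATH):
    ref = F.scaled_dot_product_attention(q, k, v)

for name, backend in [("SDPA-Flash", SDPBackend.FLASH_ATTENTION),
                      ("SDPA-CuDNN", SDPBackend.CUDNN_ATTENTION)]:
    print(f"{name:-^100}")
    with sdpa_kernel(backend):
        out = F.scaled_dot_product_attention(q, k, v)
    # Loose tolerances since the reference is also computed in fp16.
    print("ALL GOOD" if torch.allclose(out, ref, atol=2e-3, rtol=2e-3) else "MISMATCH")
```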

For the manual API, why have both a string and an `int4wo(group_size)` version? I think it would be cleaner to just have one of these.

> so the motivation for string is so that people don't need to import anything to use it, it's just a simple shortcut and we'll make sure to align the names...
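For illustration only, a sketch of the two spellings being compared: `int4_weight_only` is the config-style entry point in torchao, while the string form is a hypothetical shortcut along the lines of the quoted reply, not a confirmed API:

```python
# Sketch of the two styles under discussion; the string spelling is hypothetical.
import torch
from torchao.quantization import quantize_, int4_weight_only

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to(torch.bfloat16).cuda()

# Config-object form: requires an import, but the arguments are explicit.
quantize_(model, int4_weight_only(group_size=64))

# String-shortcut form from the quoted reply: no import needed, the group size
# is baked into the name (illustrative spelling only).
# quantize_(model, "int4wo-64")
```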

Should we try and land this one?

Can you make sure to set https://github.com/pytorch/ao/blob/e7b33bc91c831d10249c1222c8b4b667f18f28b7/torchao/float8/config.py#L246 to True?

@jeffdaily Have you verified that the existing fp8 routines work on ROCm? Unfortunately we still don't have ROCm runners in CI/CD, and at least personally I don't have much access to...

@clintg6 You need to specify the dtype for these to be the `fnuz` variant, e.g. `float8_dynamic_activation_float8_weight(torch.float8_e4m3fnuz, torch.float8_e4m3fnuz)`.
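For context, a minimal sketch of how that call slots into `quantize_`; the model and shapes are placeholders, and only the dtype arguments come from the comment above:

```python
# Sketch: pass the ROCm 'fnuz' float8 dtypes explicitly, since the defaults
# target the e4m3fn variant.
import torch
from torchao.quantization import quantize_, float8_dynamic_activation_float8_weight

model = torch.nn.Sequential(torch.nn.Linear(4096, 4096)).to(torch.bfloat16).cuda()
quantize_(
    model,
    float8_dynamic_activation_float8_weight(torch.float8_e4m3fnuz, torch.float8_e4m3fnuz),
)
```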

The discussion was that dynamic rank is less common, and in that case `-1` will likely not work. But most common quantization schemes with varying sizes in fixed-rank tensors...
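To make that concrete, a small illustration of the block-size bookkeeping being referred to; the shape and group size are made up, and this is plain tensor arithmetic rather than any particular torchao API:

```python
# Illustration: for a fixed-rank weight, a per-group scheme can spell out every
# dimension of the block size explicitly, so a -1 wildcard isn't needed.
import torch

weight = torch.randn(4096, 11008)   # rank is fixed and known up front
group_size = 128

block_size = (1, group_size)         # one output row per block, group_size columns
n_blocks = [dim // blk for dim, blk in zip(weight.shape, block_size)]
print(n_blocks)  # [4096, 86]

# A dynamic-rank tensor would need a wildcard such as -1 for "the whole dim",
# which is the less common case described above.
```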