Driss Guessous
cc @andrewor14 this is the theme update I was talking about
@andrewor14 Just rebased, let's see if the docs populate
@pytorchbot merge
Unfortunately, Float8InferenceLinear is being developed against the latest PyTorch nightly and is not well tested on older versions of PyTorch. If it is possible for you to update your...
https://github.com/pytorch/torchtitan/pull/1208
@eqy does cuDNN JIT-compile for every updated sequence length? That seems non-ideal
Started to work on the pre-reqs: https://github.com/pytorch/pytorch/pull/143515. But yeah, as of right now the most performant kernel we have in PyTorch is the cuDNN backend on H100.
`SDPBackend.CUDNN_ATTENTION` is the fastest implementation currently supported for SDPA and is meant for H100+ GPUs. For A100s and A10s, FlashAttention v2 is still your best bet. >is much faster...
Looks good. I also hope this ends up being a pretty small PR, since we had this enabled previously in fp8 experimental.
This linter seems to be unaware of f-strings: 