Charlene Yang
# What does this PR do? Update the flash attention section in memory_optimizations.rst **Collection**: [Note which collection this PR will affect] # Changelog - Added more information about flash...
# Description This PR helps resolve issues #614 and #629. Moving forward, we'd like to define the attention mask consistently in PyTorch, JAX, and Paddle, with `True` meaning masking out the...
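A minimal sketch of the convention this PR standardizes, using plain PyTorch (shapes and names here are illustrative, not the library's API): `True` in the attention mask means "mask out", i.e. the position does not participate in attention.

```python
import torch

q = torch.randn(1, 4, 8)                       # [batch, seq, head_dim]
k = torch.randn(1, 4, 8)
scores = q @ k.transpose(-2, -1) / 8 ** 0.5    # [batch, seq_q, seq_kv]

mask = torch.zeros(1, 4, 4, dtype=torch.bool)
mask[..., -1] = True                           # True = mask out the last key position

scores = scores.masked_fill(mask, float("-inf"))
probs = torch.softmax(scores, dim=-1)          # masked positions receive weight 0
```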
# Description This PR adds THD support for fused attention (the `F16_arbitrary_seqlen` backend). This feature allows users to run attention in two more cases:
```
case 1: no padding between sequences...
```
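An illustrative sketch of the THD packed layout (the variable names and shapes below are assumptions for illustration, not the library's exact API): variable-length sequences are concatenated along the token axis and delimited by cumulative sequence lengths.

```python
import torch

seq_lens = [3, 5, 2]                            # three sequences in one batch
total_t, num_heads, head_dim = sum(seq_lens), 4, 8

# Case 1 from the PR: tokens packed back to back, no padding between sequences.
q = torch.randn(total_t, num_heads, head_dim)   # THD layout: [t, h, d]
cu_seqlens = torch.tensor([0, 3, 8, 10], dtype=torch.int32)
# Rows cu_seqlens[i] : cu_seqlens[i+1] belong to sequence i.
```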
# Description This PR merges `k_channels` and `v_channels` back into `kv_channels` in `DotProductAttention.__init__()` to avoid breaking backward compatibility, while still allowing users to pass a tuple `(int, int)` in...
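A sketch of the resulting interface as described above; this assumes `transformer_engine` is installed and follows the documented `DotProductAttention` signature.

```python
from transformer_engine.pytorch import DotProductAttention

# Backward compatible: a single int sets both the K and V head sizes.
attn = DotProductAttention(num_attention_heads=16, kv_channels=64)

# New in this PR: a (k_channels, v_channels) tuple for differing head sizes.
attn_asym = DotProductAttention(num_attention_heads=16, kv_channels=(192, 128))
```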
# Description This PR adds support for `padding`, `padding_causal`, `padding_causal_bottom_right` masks in the PyTorch `UnfusedDotProductAttention` backend. ## Type of change - [ ] Documentation change (change only to the documentation,...
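A hedged sketch of building a padding mask in the `True` = masked-out convention; the shapes are illustrative.

```python
import torch

batch, max_seqlen = 2, 6
seq_lens = torch.tensor([4, 6])               # valid tokens per sequence

# [batch, 1, 1, max_seqlen]: True at pad positions, broadcastable over
# heads and query positions.
pad_mask = torch.arange(max_seqlen)[None, :] >= seq_lens[:, None]
pad_mask = pad_mask[:, None, None, :]
# With this PR, such a mask also works when the unfused backend is selected,
# e.g. together with attn_mask_type="padding".
```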
# Description This PR adds support for [flash-attn 3](https://github.com/Dao-AILab/flash-attention). The FA3 code is still rolling out this week, and this PR is a WIP integration of it. ## Type of...
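A hedged sketch of how FA3 availability can be probed: at the time of this PR, the FA3 beta shipped as a separate `flash_attn_interface` module (Hopper-only), independent of the flash-attn 2 package.

```python
try:
    from flash_attn_interface import flash_attn_func as flash_attn_func_v3
    FLASH_ATTN_3_AVAILABLE = True
except ImportError:
    FLASH_ATTN_3_AVAILABLE = False
```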
# Description This PR updates some of the scripts in `docs/example/attention` and `benchmarks/attention`. - `docs/example/attention/example_attention.py` - `docs/example/attention/arbitrary_mask_to_post_scale_bias.py` - `benchmarks/attention/benchmark_attention.py` ## Type of change - [x] Documentation change (change only to...
# Description This PR adds a note about the change in location of the FP8 metadata `._extra_state` in checkpoints between TE 1.5 and now. ## Type of change -...
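A hypothetical sketch of remapping checkpoint keys when loading a TE 1.5-era checkpoint into a newer release; the old/new key patterns below are illustrative assumptions, not the library's exact paths.

```python
def remap_extra_state(state_dict,
                      old=".fused_attention._extra_state",
                      new="._extra_state"):
    # Move FP8 metadata entries from their pre-1.6 location to the new one.
    return {key.replace(old, new): value for key, value in state_dict.items()}
```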
# Description This PR makes a few changes to the FA3 attention path. - Adds `descale_q`, `descale_k` and `descale_v` to the FA3 FP8 call. This allows custom descaling factors instead...
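A conceptual sketch of what a descale factor means in the FP8 path (the values here are illustrative): an FP8 tensor is interpreted as `stored_value * descale`, and this PR lets callers pass their own `descale_q`/`descale_k`/`descale_v` to the FA3 kernel instead of a fixed default of 1.0.

```python
import torch

amax = 3.5                                    # running absolute max of q
descale_q = torch.tensor([amax / 448.0])      # 448 = max normal value in e4m3
q_fp8 = (torch.randn(4, 8) / descale_q).to(torch.float8_e4m3fn)
q_dequant = q_fp8.float() * descale_q         # what the kernel applies implicitly
```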
# Description This PR improves the `get_attention_backend()` logic. Fixes #1195 ## Type of change - [ ] Documentation change (change only to the documentation, either a fix or a new...
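A hedged sketch of steering and inspecting this selection logic with the documented `NVTE_*` environment variables, which filter the backends `get_attention_backend()` may choose and log why one was picked.

```python
import os

os.environ["NVTE_FLASH_ATTN"] = "0"    # disallow the flash-attention backends
os.environ["NVTE_FUSED_ATTN"] = "1"    # allow the cuDNN fused backend
os.environ["NVTE_DEBUG"] = "1"         # enable debug logging
os.environ["NVTE_DEBUG_LEVEL"] = "2"   # include backend-selection details
```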