
Hackable and optimized Transformers building blocks, supporting a composable construction.

Results: 158 xformers issues, sorted by most recently updated.

# ❓ Questions and Help

`python -m xformers.info` warns: WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.2.0+cu121 with CUDA 1201 (you have 2.2.0+cpu) Python 3.11.7 (you have...
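
The warning above usually means a CPU-only PyTorch wheel is installed alongside a CUDA build of xFormers. A quick way to confirm the mismatch (a minimal check, not the full `xformers.info` output):

```python
import torch

# xFormers ships binaries compiled against a CUDA-enabled PyTorch build.
# If torch reports no CUDA (a "+cpu" wheel), the C++/CUDA extensions
# cannot load and xFormers falls back with the warning shown above.
print(torch.__version__)          # e.g. "2.2.0+cpu" vs the expected "2.2.0+cu121"
print(torch.version.cuda)         # None on a CPU-only build
print(torch.cuda.is_available())  # must be True for the CUDA kernels to load
```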

xformers fails with the following error when run with Accelerate: `ValueError: Query/Key/Value should either all have the same dtype, or (in the quantized case) Key/Value should have dtype torch.int32...`
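
The error above comes from `memory_efficient_attention` requiring a single dtype across query, key, and value (int32 key/value being the quantized exception it names). A minimal sketch of the workaround, with illustrative shapes:

```python
import torch
import xformers.ops as xops

# Illustrative shapes: (batch, seq_len, heads, head_dim)
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float32)  # mismatched dtype
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float32)

# Casting all three tensors to one common dtype satisfies the check
# that raises the ValueError quoted above.
common = torch.float16
out = xops.memory_efficient_attention(q.to(common), k.to(common), v.to(common))
```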

# ❓ Questions and Help ![screenshot](https://github.com/facebookresearch/xformers/assets/155218424/3fd3d380-68ec-41a0-acbf-7f9a0da6f678)

Hi, I followed the instructions given [here](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers) to build and install the latest xformers version. More specifically, I ran the command below, but it seems that the [sequence_parallel_fused kernel](https://github.com/facebookresearch/xformers/commit/342de87b6dcf6f6f1d410823479af0c14aa03317) is not...

bug

On Windows 10 x64 with Python 3.10.11, a GeForce RTX 4080, and a Ryzen 5800X3D: https://huggingface.co/r4ziel/xformers_pre_built/tree/main/triton-2.0.0-cp310-cp310-win_amd64.whl. "A matching Triton is not available, some optimizations will not be enabled." Traceback (most recent call...

Fixed typos

CLA Signed

I'm converting [hf transformers T5](https://github.com/huggingface/transformers/blob/main/src/transformers/models/t5/modeling_t5.py#L453) to use [memory_efficient_attention()](https://facebookresearch.github.io/xformers/components/ops.html). I've reached the point where I get identical results between the original implementation and `memory_efficient_attention()`; however, I pass `attn_bias` as a...
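
For context on the `attn_bias` question above, a minimal sketch of passing a dense additive bias to `memory_efficient_attention`. The shapes here are illustrative, and dense biases may carry alignment constraints on the last dimension, so treat this as an assumption to verify against the ops documentation:

```python
import torch
import xformers.ops as xops

B, M, H, K = 2, 128, 8, 64  # batch, seq_len, heads, head_dim (illustrative)
q = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
k = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
v = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)

# Additive bias broadcastable to (batch, heads, q_len, kv_len);
# -inf masks a key position out entirely, matching how T5's additive
# attention mask is usually expressed.
bias = torch.zeros(B, H, M, M, device="cuda", dtype=q.dtype)
bias[:, :, :, -1] = float("-inf")  # e.g. mask the final key position

out = xops.memory_efficient_attention(q, k, v, attn_bias=bias)
```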

Hello, a peer of mine ran the benchmark script on an A100. Under what conditions should we see the most significant gain for the sparse 2:4 linear or activations? ```...
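
As background for the question above: 2:4 semi-structured sparsity only engages the sparse tensor cores for large, GEMM-bound fp16/bf16 matmuls on Ampere or newer GPUs, so gains are usually measured across a sweep of sizes. A minimal sketch using plain PyTorch rather than the repo's benchmark script; the shapes and the CUTLASS-backend toggle are assumptions:

```python
import torch
import torch.nn.functional as F
from torch.sparse import to_sparse_semi_structured, SparseSemiStructuredTensor

# Some builds lack cuSPARSELt; the CUTLASS backend is the portable fallback.
SparseSemiStructuredTensor._FORCE_CUTLASS = True

n = 4096  # illustrative; speedups typically need large, GEMM-bound shapes
# Weight with an explicit 2:4 pattern: two zeros in every contiguous group of four.
mask = torch.tensor([0, 0, 1, 1], dtype=torch.float16, device="cuda").repeat(n, n // 4)
w = torch.randn(n, n, device="cuda", dtype=torch.float16) * mask
x = torch.randn(n, n, device="cuda", dtype=torch.float16)

dense_out = F.linear(x, w)
sparse_out = F.linear(x, to_sparse_semi_structured(w))  # sparse tensor-core path
```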

# 🐛 Bug I am currently experimenting with different scaled dot product attention implementations to evaluate training speed and GPU memory consumption. I compared all methods by running the following `train.py`...
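
For isolating the implementations mentioned in that issue, PyTorch can pin `scaled_dot_product_attention` to a single backend at a time. A minimal sketch, assuming a CUDA device and illustrative shapes (the `sdp_kernel` context manager is the PyTorch 2.2-era API):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Pin SDPA to one backend (flash / memory-efficient / math) so that
# speed and peak-memory numbers can be attributed to a single kernel.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    out = F.scaled_dot_product_attention(q, k, v)
torch.cuda.synchronize()
print(f"peak memory: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")
```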

When I train without xformers installed, one epoch takes about 9 hours; after installing xformers 0.0.24, it takes about 26 hours. OS: Linux, CUDA: 11.8, PyTorch: 2.2.0. If I use...