dan_the_3rd

Results: 83 comments of dan_the_3rd

> Shapes of q, k, v are torch.Size([2, 8, 2048, 64]) in [b, h, n, d].

Currently, the backward pass on V100 isn't well parallelised. You will get the best performance if...
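For reference, here's a minimal sketch of how tensors in that layout would be passed to `xformers.ops.memory_efficient_attention` (the CUDA device, fp16 dtype and the transpose back at the end are illustrative assumptions): the op expects inputs in [B, M, H, K] (batch, sequence, heads, head dim), so [b, h, n, d] tensors need a transpose first.

```python
import torch
import xformers.ops as xops

# Shapes from the question: [b, h, n, d] = [2, 8, 2048, 64]
b, h, n, d = 2, 8, 2048, 64
q = torch.randn(b, h, n, d, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# memory_efficient_attention expects [B, M, H, K] (batch, seq len, heads, head dim),
# so [b, h, n, d] tensors are transposed before the call and back afterwards.
q, k, v = (t.transpose(1, 2).contiguous() for t in (q, k, v))
out = xops.memory_efficient_attention(q, k, v)  # [B, M, H, K]
out = out.transpose(1, 2)                       # back to [b, h, n, d]
print(out.shape)  # torch.Size([2, 8, 2048, 64])
```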

> Just curious why supporting Alibi is difficult? I noticed that the official flash attention repo doesn't support it either.

I don't think it's difficult. It's just some additional work...

We don't plan to implement it ourselves at the moment. However, it seems to be on @tridao's [roadmap](https://github.com/HazyResearch/flash-attention):

> [May 2023] Support attention bias (e.g. ALiBi, relative positional encoding)...

> However, it seems to be on @tridao's [roadmap](https://github.com/HazyResearch/flash-attention)

It looks like it's no longer on the roadmap. On our side, we don't plan to implement that on the...
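Since ALiBi isn't built into xFormers, one workaround is to materialise the bias yourself and pass it through `memory_efficient_attention`'s `attn_bias` argument. This is only a minimal sketch: the `alibi_bias` helper, shapes and dtype are illustrative assumptions, dense-bias support (and its alignment requirements) varies across xFormers versions, and materialising a [B, H, M, M] tensor gives up some of the memory savings for long sequences.

```python
import torch
import xformers.ops as xops

def alibi_bias(n_heads: int, seq_len: int, device, dtype) -> torch.Tensor:
    # Slopes from the ALiBi paper: a geometric sequence starting at 2^(-8/n_heads)
    # (exact when n_heads is a power of two).
    slopes = torch.tensor(
        [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)],
        device=device, dtype=dtype,
    )
    pos = torch.arange(seq_len, device=device)
    rel = (pos[None, :] - pos[:, None]).to(dtype)       # j - i, shape [seq, seq]
    bias = slopes[:, None, None] * rel[None, :, :]       # [heads, seq, seq]
    # Causal mask on top of the linear bias: block attention to future keys.
    causal = torch.triu(
        torch.full((seq_len, seq_len), float("-inf"), device=device, dtype=dtype),
        diagonal=1,
    )
    return (bias + causal).unsqueeze(0)                  # [1, heads, seq, seq]

b, h, n, d = 2, 8, 1024, 64
q = torch.randn(b, n, h, d, device="cuda", dtype=torch.float16)  # [B, M, H, K]
k, v = torch.randn_like(q), torch.randn_like(q)

bias = alibi_bias(h, n, q.device, q.dtype).expand(b, -1, -1, -1).contiguous()
out = xops.memory_efficient_attention(q, k, v, attn_bias=bias)   # [B, M, H, K]
```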

You should install torch before installing xFormers.

That's definitely not a good situation, but we couldn't find a satisfying solution. The problem with *not* building in the destination environment is that you might end up building xFormers...

You can also use the PyPI wheels, which are already built: https://pypi.org/project/xformers/#history
As the binaries are built with a specific version of PyTorch, they have that version pinned as a...

xFormers is not compatible with macOS.

Also xFormers now requires PyTorch 2.1+, so you won't be able to run the latest version: https://github.com/facebookresearch/xformers/blob/main/requirements.txt
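If in doubt, a quick runtime check along these lines makes the mismatch obvious before you hit a confusing import error (a minimal sketch; the 2.1 minimum comes from the requirements file linked above):

```python
# Minimal sketch: check that the installed PyTorch meets the minimum required by
# recent xFormers releases (2.1+, per requirements.txt) before relying on it.
import torch

MIN_TORCH = (2, 1)
installed = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
if installed < MIN_TORCH:
    raise RuntimeError(
        f"PyTorch {torch.__version__} is older than "
        f"{'.'.join(map(str, MIN_TORCH))}; recent xFormers wheels are built "
        "against a newer, pinned torch version."
    )

import xformers
print("torch:", torch.__version__, "| xformers:", xformers.__version__)
```

Running `python -m xformers.info` also reports the detected torch version and which kernels are available in your install.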

Hey - just wanted to flag that we have Windows builds for cu118 / cu121 (https://github.com/facebookresearch/xformers/tree/main#installing-xformers). The cu121 Windows builds now include Flash-Attention, so you shouldn't need to build from...