
CUDA based Pytorch Flash Attention is straight up non-functional / non-existent on Windows in *ALL* PyTorch versions above 2.1.2, opening this issue just to remove the weird vagueness around this.

Open · Akira13641 opened this issue 1 year ago · 12 comments

It straight up doesn't work, period, because it's not there: for some reason they are no longer compiling PyTorch with it on Windows. As it stands, you WILL be indefinitely spammed with `UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)` unless you manually uninstall the torch that Comfy currently lists in its requirements.txt and then run `pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121` to get back the last version that worked as expected.
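For anyone who wants to confirm what their own install is doing, here is a minimal sketch (my assumptions: a CUDA GPU, fp16 tensors, and arbitrary small shapes) that calls PyTorch's built-in scaled_dot_product_attention and records whatever warnings it raises; on an affected Windows wheel you should see the "1Torch was not compiled with flash attention" message show up.

```python
# Minimal check: does torch's built-in SDPA warn about missing flash attention?
# Assumes a CUDA GPU is available; tensor shapes are arbitrary illustrative values.
import warnings
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)  # (batch, heads, seq, head_dim)
k = torch.randn_like(q)
v = torch.randn_like(q)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = F.scaled_dot_product_attention(q, k, v)
    torch.cuda.synchronize()

for w in caught:
    print(w.category.__name__, w.message)
print("output shape:", tuple(out.shape))
```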

It's not at all clear to me why no one has yet pointed out that this isn't a mysterious or vague problem; it's a very obvious problem with a single, very clear cause.

Akira13641 avatar Apr 27 '24 23:04 Akira13641

> and then run `pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121` to get back the last version that worked as expected.

ERROR: Could not find a version that satisfies the requirement torch==2.1.2 (from versions: 2.2.0+cu121, 2.2.1+cu121, 2.2.2+cu121, 2.3.0+cu121)
ERROR: No matching distribution found for torch==2.1.2

DollarAkshay avatar Apr 29 '24 04:04 DollarAkshay

> and then run `pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121` to get back the last version that worked as expected.
>
> ERROR: Could not find a version that satisfies the requirement torch==2.1.2 (from versions: 2.2.0+cu121, 2.2.1+cu121, 2.2.2+cu121, 2.3.0+cu121)
> ERROR: No matching distribution found for torch==2.1.2

I dunno why it's not working for you. In any case, assuming Python 3.11, these would be the exact wheels:
https://download.pytorch.org/whl/cu121/torch-2.1.2%2Bcu121-cp311-cp311-win_amd64.whl
https://download.pytorch.org/whl/cu121/torchvision-0.16.2%2Bcu121-cp311-cp311-win_amd64.whl
https://download.pytorch.org/whl/cu121/torchaudio-2.1.2%2Bcu121-cp311-cp311-win_amd64.whl

Also, "pip" here should specifically be ComfyUI's embedded pip, of course, not your global system one (if it exists).
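If it helps, a quick way to double-check which interpreter and which torch build is actually being used is to save something like the snippet below (check_env.py is just a placeholder name) and run it with the embedded python, e.g. `python_embeded\python.exe check_env.py`:

```python
# Prints which Python interpreter and which torch/CUDA build is actually in use.
import sys
import torch

print("interpreter:", sys.executable)
print("torch:", torch.__version__)
print("torch CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
```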

Akira13641 avatar Apr 29 '24 18:04 Akira13641

@Akira13641 I came here because I have the same problem, but I'm not using ComfyUI; I'm using the transformers library. I was trying to load a model in PyTorch.

I have Python 3.12, maybe that's the problem.

DollarAkshay avatar Apr 29 '24 19:04 DollarAkshay

> @Akira13641 I came here because I have the same problem, but I'm not using ComfyUI; I'm using the transformers library. I was trying to load a model in PyTorch.
>
> I have Python 3.12, maybe that's the problem.

Yeah, I'm not sure all this stuff is available for Python 3.12 yet.

Akira13641 avatar Apr 30 '24 14:04 Akira13641

Might have something to do with Flash Attention 2 not yet officially supporting Windows. It can be compiled, though; for instance, see https://www.charlesherring.com/coding/compiling-pytorch-windows11
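If anyone does go the compile route, a rough sanity check for the resulting wheel might look like the sketch below (my assumptions: the build succeeded, the GPU is one flash-attn supports, and fp16 inputs; flash_attn_func takes (batch, seqlen, nheads, headdim) tensors):

```python
# Rough sanity check for a locally compiled flash-attn wheel.
# Assumes flash-attn built successfully, a supported CUDA GPU, and fp16 inputs.
import torch
import flash_attn
from flash_attn import flash_attn_func

print("flash-attn version:", flash_attn.__version__)

# flash_attn_func expects (batch, seqlen, nheads, headdim) half-precision CUDA tensors.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, dropout_p=0.0, causal=False)
print("ok, output shape:", tuple(out.shape))
```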

LucisVivae avatar May 10 '24 12:05 LucisVivae

I have also successfully compiled flash-attn 2.6.1 on Windows. It is definitely there (the python_embeded Python is the only one on my system, I added it to PATH, and flash-attn is properly placed in the embedded site-packages folder).

[screenshot attached]

Still, using the most recent Comfy dev build (torch 2.5, CUDA 12.4), it straight up refuses to use it. The "1Torch" warning does not go away, and it is clearly not using flash attention, as inference time and VRAM usage remain the same.
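One possible explanation worth noting here (my understanding, not something confirmed in this thread): Comfy's default attention path calls torch.nn.functional.scaled_dot_product_attention, which only uses the flash kernels compiled into the torch wheel itself; it does not pick up a separately installed flash-attn package, which is only used by code that imports flash_attn directly. A way to make a missing kernel fail loudly instead of silently falling back is to restrict SDPA to the flash backend. A sketch, assuming torch >= 2.3 (for torch.nn.attention.sdpa_kernel) and a CUDA GPU:

```python
# Sketch: force SDPA to use only its flash backend so a missing kernel fails loudly
# instead of silently falling back to another kernel. Requires torch >= 2.3;
# shapes and dtypes are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

q = torch.randn(1, 8, 256, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

try:
    with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
        out = F.scaled_dot_product_attention(q, k, v)
        torch.cuda.synchronize()
    print("flash SDPA kernel ran, output shape:", tuple(out.shape))
except RuntimeError as e:
    print("flash SDPA kernel is not usable here:", e)
```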

molkemon avatar Jul 19 '24 00:07 molkemon

This error finally popped up for me and completely bricked my ComfyUI. No matter what I do (reinstalling everything, trying the portable version, etc.), I keep getting a variety of Torch errors: this 1Torch error, "Torch not compiled with CUDA enabled", all sorts. Is there really no reliable fix?

UPDATE: the only way I got it to run was to use the environment of my ReForge installation, which uses PyTorch 2.1.2+cu121. I cannot install that PyTorch version via the manual command any more, though, because it cannot be found. Guess I'm stuck with this for now.

albozes avatar Aug 05 '24 09:08 albozes

g:\paperclips\comfy\python_embeded>python -c "import torch; print(torch.backends.cuda.flash_sdp_enabled())"
True

g:\paperclips\comfy\python_embeded>python -m pip show torch
Name: torch
Version: 2.2.2+cu121

g:\paperclips\comfy\python_embeded>python -m pip show flash-attn
Name: flash_attn
Version: 2.6.1

But I'm also getting the 1Torch error spam, and generation takes a long time.
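Worth noting: as far as I can tell, flash_sdp_enabled() only reports whether the flash backend is allowed, not whether the kernel was actually compiled into the wheel, which is why it can return True on a build that still prints the "not compiled with flash attention" warning. The same goes for the other backend flags:

```python
# These only report whether each SDPA backend is *enabled* (allowed),
# not whether the corresponding kernel was actually compiled into this wheel.
import torch

print("flash enabled:        ", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math enabled:         ", torch.backends.cuda.math_sdp_enabled())
```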

thot-experiment avatar Aug 06 '24 22:08 thot-experiment

Same issue. I tried to use FLUX today (12 GB VRAM, 32 GB RAM), but it straight up refuses to use flash attention, leading to a 30-minute generation time per image (20 iterations). Completely unusable.

Kusnezow94 avatar Aug 08 '24 07:08 Kusnezow94

I updated my Python and everything to try something. Big mistake! I have the same problem. I am new to this as well. My generations were pretty fast with a 3090 and 64 GB RAM; I'm not sure if it's only related to this, but they are so slow now. I tried to make a new env with different versions but failed, so maybe the Python upgrade destroyed it. Stupid of me :( Don't touch it if it's working.

Vorzec avatar Aug 08 '24 08:08 Vorzec

Did you solve the problem? I have the same issue, and the first images always take a very long time to produce.

jstapletton avatar Aug 08 '24 20:08 jstapletton

> Did you solve the problem? I have the same issue, and the first images always take a very long time to produce.

I'm using PyTorch version 2.4.0+cu124 and xformers version 0.0.28.dev895.

I still have this issue... I have to use torch 2.4.0 because ComfyUI's new --fast option only supports torch 2.4.0, but now we have this issue instead.

clip missing: ['text_projection.weight']
E:\ComfyUI-aki-v1.3\comfy\ldm\modules\attention.py:407: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)

Extremely long load time. I reported a new issue here: https://github.com/comfyanonymous/ComfyUI/issues/4663
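Since xformers is already installed in that setup, it may be worth checking that its memory-efficient attention runs on its own, because it ships its own kernels and (as far as I understand Comfy's attention selection) can be used instead of torch's built-in SDPA. A minimal sketch, assuming a CUDA GPU, fp16 tensors, and illustrative shapes:

```python
# Quick check that xformers' memory-efficient attention runs, since it ships its own
# kernels and does not depend on torch being compiled with flash attention.
# Assumes a CUDA GPU and the xformers wheel installed; shapes are illustrative.
import torch
import xformers.ops as xops

q = torch.randn(1, 256, 8, 64, device="cuda", dtype=torch.float16)  # (batch, seq, heads, dim)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = xops.memory_efficient_attention(q, k, v)
print("xformers attention ok, output shape:", tuple(out.shape))
```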

LiJT avatar Aug 28 '24 16:08 LiJT