CUDA-based PyTorch Flash Attention is flat-out non-functional / non-existent on Windows in *ALL* PyTorch versions above 2.1.2. I'm opening this issue just to remove the weird vagueness around this.
It straight up doesn't work, period, because it isn't there: for whatever reason the Windows builds of PyTorch are no longer compiled with it. As it stands, you WILL be indefinitely spammed with UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.) unless you manually uninstall the Torch version ComfyUI currently lists in its requirements.txt and then run pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121 to get back the last build that worked as expected.
It's not at all clear to me why no one has pointed out yet that this isn't a mysterious or vague problem; it's a very obvious problem with a single, very clear cause.
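If you want to verify this yourself rather than take my word for it, here is a minimal sketch (my own check, not ComfyUI code; it assumes a CUDA GPU and fp16 tensors with head dim 64, i.e. a case the flash kernel would normally accept): restrict SDPA to the flash backend only. On a build that wasn't compiled with the flash kernels there is nothing to dispatch to, so the call should raise a RuntimeError instead of silently falling back to the math kernel.

import torch
import torch.nn.functional as F

# fp16 CUDA tensors with head dim 64: a shape/dtype the flash kernel would normally handle
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# torch.backends.cuda.sdp_kernel is the pre-2.3 context manager;
# newer versions expose torch.nn.attention.sdpa_kernel instead.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    try:
        F.scaled_dot_product_attention(q, k, v)
        print("flash attention kernel ran")
    except RuntimeError as err:
        print("no flash attention in this build:", err)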
I ran pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121 as suggested to get back the last version that worked as expected, but pip fails with:
ERROR: Could not find a version that satisfies the requirement torch==2.1.2 (from versions: 2.2.0+cu121, 2.2.1+cu121, 2.2.2+cu121, 2.3.0+cu121)
ERROR: No matching distribution found for torch==2.1.2
I dunno why it's not working for you; in any case, assuming Python 3.11, these would be the exact wheels:
https://download.pytorch.org/whl/cu121/torch-2.1.2%2Bcu121-cp311-cp311-win_amd64.whl
https://download.pytorch.org/whl/cu121/torchvision-0.16.2%2Bcu121-cp311-cp311-win_amd64.whl
https://download.pytorch.org/whl/cu121/torchaudio-2.1.2%2Bcu121-cp311-cp311-win_amd64.whl
Also, pip here should specifically be ComfyUI's embedded pip, of course, not your global system one (if that exists).
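Concretely, for the portable build that means something like this, run from the ComfyUI folder (python_embeded is the standard portable layout; adjust the path if yours differs):

.\python_embeded\python.exe -m pip uninstall -y torch torchvision torchaudio
.\python_embeded\python.exe -m pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121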
@Akira13641 I came here because I have the same problem, but I'm not using ComfyUI; I'm using the transformers library and was trying to load a model in PyTorch.
I have Python 3.12, maybe that's the problem.
Yeah, I'm not sure all of this is available for Python 3.12 yet.
Might have something to do with Flash Attention 2 not yet officially supporting Windows. It can be compiled, though; for instance, see https://www.charlesherring.com/coding/compiling-pytorch-windows11
I have also successfully compiled flash-attn 2.6.1 on Windows. It is definitely there (the python_embeded Python is the only one on my system, I added it to PATH, and flash-attn is properly placed in the embedded install's site-packages folder).
Still, using the most recent Comfy dev build (torch 2.5, CUDA 12.4), it flat-out refuses to use it. The "1Torch was not compiled with flash attention" warning does not go away, and it is clearly not using flash attention, since inference time and VRAM usage remain the same.
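One thing worth noting, as far as I can tell from the traceback paths in this thread: that warning comes from PyTorch's own torch.nn.functional.scaled_dot_product_attention dispatch, which only uses the flash kernels baked into the PyTorch wheel itself. A pip-installed flash-attn wheel is a separate library, so it won't make the warning go away unless a code path explicitly imports flash_attn. You can still smoke-test the compiled wheel on its own; a minimal sketch, assuming an fp16-capable CUDA GPU:

import torch
from flash_attn import flash_attn_func

# flash_attn_func expects (batch, seqlen, nheads, headdim) fp16/bf16 CUDA tensors
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=False)
print(out.shape)  # should print torch.Size([2, 128, 8, 64]) if the wheel built correctly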
This error finally popped up for me and completely bricked my ComfyUI. No matter what I do (reinstalling everything, trying the portable version, etc.), I keep getting a variety of Torch errors: this 1Torch warning, "Torch not compiled with CUDA enabled", all sorts. Is there really no reliable fix?
UPDATE: the only way I got it to run was to use the environment of my ReForge installation, which uses PyTorch 2.1.2+cu121. I cannot install that PyTorch version via the manual command any more, though, because it cannot be found. Guess I'm stuck with this for now.
g:\paperclips\comfy\python_embeded>python -c "import torch; print(torch.backends.cuda.flash_sdp_enabled())"
True
g:\paperclips\comfy\python_embeded>python -m pip show torch
Name: torch
Version: 2.2.2+cu121
g:\paperclips\comfy\python_embeded>python -m pip show flash-attn
Name: flash_attn
Version: 2.6.1
But I'm also getting the 1Torch warning spam, and generation takes a long time.
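Note that torch.backends.cuda.flash_sdp_enabled() only reports whether the flash backend is enabled as a preference; it defaults to True even on a wheel that contains no flash kernels, so it doesn't really tell you anything here. A rough way to see whether the fallback actually happens is to run SDPA once in a fresh interpreter and capture the warning (a sketch only, assuming a CUDA GPU; the warning fires once per process, so it won't show up if something already triggered it):

import warnings
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    F.scaled_dot_product_attention(q, k, v)

# on a build without the flash kernels, the "1Torch was not compiled with
# flash attention" UserWarning should appear in this list
print([str(w.message) for w in caught])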
Same issue. Tried to use FLUX today (12 GB VRAM, 32 GB RAM), but it flat-out refuses to use flash attention, leading to a 30-minute generation time per image (20 iterations). Completely unusable.
I updated my Python and everything else to try something. Big mistake! I have the same problem. I'm new to this as well. My generations were pretty fast with a 3090 and 64 GB of RAM; I'm not sure if it's only related to this, but they are so slow now. I tried to make a new env with different versions but failed, so yeah, maybe the Python upgrade destroyed it, stupid :( Don't touch it if it's working right :/
Did you solve the problem? I have the same issue, and the first images always take a very long time to produce.
I'm using PyTorch version 2.4.0+cu124 and xformers version 0.0.28.dev895, and I still have this issue... I have to use torch 2.4.0 because ComfyUI's new --fast option only supports torch 2.4.0, but with it we now run into this issue.
clip missing: ['text_projection.weight']
E:\ComfyUI-aki-v1.3\comfy\ldm\modules\attention.py:407: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
Extremely long load time. I reported a new issue here: https://github.com/comfyanonymous/ComfyUI/issues/4663