
VibeVoice error : FlashAttention2 not installed

Open · YashMohey opened this issue 4 months ago · 7 comments

I am facing an error when running VibeVoice in TTS-WebUI.

Logs:

Instantiating VibeVoiceForConditionalGenerationInference model under default dtype torch.bfloat16.
❌ An unexpected error occurred: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
Traceback (most recent call last):
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\extension_vibevoice\backend_api.py", line 155, in generate_podcast_streaming
    self.load_model()
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\extension_vibevoice\backend_api.py", line 61, in load_model
    self.model = VibeVoiceForConditionalGenerationInference.from_pretrained(
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 279, in _wrapper
    return func(*args, **kwargs)
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 4336, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 2109, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "G:\AI\TTS-WebUI-main\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 2252, in _check_and_enable_flash_attn_2
    raise ImportError(f"{preface} the package flash_attn seems to be not installed. {install_message}")
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
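The traceback shows transformers' `_autoset_attn_implementation` raising because the `flash_attn` package cannot be imported in the active environment. A minimal sketch for checking whether the package is actually visible to the Python interpreter the WebUI runs (the function name is my own, not part of TTS-WebUI):

```python
import importlib.util

def flash_attn_available() -> bool:
    """Return True if the flash_attn package can be imported from this environment."""
    return importlib.util.find_spec("flash_attn") is not None

if __name__ == "__main__":
    if flash_attn_available():
        print("flash_attn is installed")
    else:
        print("flash_attn is missing: install a matching wheel, or fall back to another attention backend")
```

Running this with the same interpreter as the WebUI (here, the one under installer_files\env) tells you whether the install step below actually landed in the right environment.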

YashMohey avatar Aug 29 '25 02:08 YashMohey

In case it helps, here’s a detailed comment showing how the installation was done on another platform (Windows, ComfyUI venv), from which I obtained the compiled version of FlashAttention 2:

  • https://github.com/kijai/ComfyUI-Florence2/issues/8#issuecomment-2181417300

I used this package — it works for me, but it depends on the exact Python/Torch/CUDA versions you’re using.
For me (Windows, Python 3.10, PyTorch 2.7.0 + CUDA 12.8) this wheel worked:

pip install --no-deps --force-reinstall ^
  https://github.com/kingbri1/flash-attention/releases/download/v2.8.2/flash_attn-2.8.2+cu128torch2.7.0cxx11abiFALSE-cp310-cp310-win_amd64.whl

After that I got a "No module named 'triton'" error, so I had to install it manually:

pip install -U "triton-windows<3.4"

With those two steps it worked fine for me.

Just make sure the wheel matches your own environment: cpXXX / cu12X / torchX.Y.
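The wheel filename encodes exactly those tags. A hypothetical helper (my own illustration, not part of any tool; real filenames also carry extra tags like cxx11abiFALSE and the platform) that assembles the version-tag portion from your Python/Torch/CUDA versions, so you can match it against a release asset:

```python
def wheel_tags(python: tuple, torch: str, cuda: str) -> str:
    """Build the version-tag portion of a flash-attn wheel filename.

    python: interpreter version, e.g. (3, 10) -> cp310
    torch:  torch version string, e.g. "2.7.0"
    cuda:   CUDA version string, e.g. "12.8" -> cu128
    """
    cp = f"cp{python[0]}{python[1]}"
    cu = "cu" + cuda.replace(".", "")
    return f"{cu}torch{torch}-{cp}"

# Matches the environment from the wheel linked above:
print(wheel_tags((3, 10), "2.7.0", "12.8"))  # cu128torch2.7.0-cp310
```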

JRodrigoTech avatar Aug 31 '25 16:08 JRodrigoTech

@JRodrigoTech Thanks for the post! And yes, currently Python 3.10, PyTorch 2.7.0 + CUDA 12.8 is the "official" version that TTS WebUI uses.

As for triton, I have Version: 3.3.1.post19 from the NARI-DIA extension.

For flash-attn on Linux, I used this wheel for the previous PyTorch version: https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp310-cp310-linux_x86_64.whl It needs to be updated for 2.7.0, though.

rsxdalv avatar Aug 31 '25 18:08 rsxdalv

I just got the same error while trying vibevoice. I'm on a Mac. Is there a solution?

scalar27 avatar Sep 03 '25 19:09 scalar27

I will add the Mac compatibility features. Thanks for letting me know that there's a Mac user for VibeVoice.


rsxdalv avatar Sep 03 '25 21:09 rsxdalv

Hi, I am getting the same FlashAttention 2 error on Ubuntu. Any advice on how to fix this? I tried the manual install mentioned above, but it broke everything.

ncoder-ai avatar Sep 09 '25 15:09 ncoder-ai

I am currently running into the same issue on Linux (Arch)

NealimeKenna avatar Oct 09 '25 18:10 NealimeKenna

Same here, on Ubuntu using docker-compose.

Zorgonatis avatar Nov 09 '25 03:11 Zorgonatis

Is it possible to disable it for now? I can't seem to find the setting to toggle it off. FlashAttention has always been a pain to install.
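As a possible workaround sketch (not tested against the extension's code): transformers' `from_pretrained` accepts an `attn_implementation` argument, and passing `"sdpa"` uses PyTorch's built-in scaled-dot-product attention instead of requiring flash_attn. A backend-selection helper might look like this (the function name and the commented usage are my own illustration; the real load call lives in extension_vibevoice/backend_api.py):

```python
import importlib.util

def pick_attn_implementation() -> str:
    """Prefer FlashAttention 2 when the package is importable, else fall back to SDPA."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"

# Hypothetical usage inside the extension's load_model:
# model = VibeVoiceForConditionalGenerationInference.from_pretrained(
#     model_path, attn_implementation=pick_attn_implementation()
# )
print(pick_attn_implementation())
```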

Yasand123 avatar Dec 11 '25 03:12 Yasand123