flash-attention
No Module Named 'torch'
When I run pip install flash-attn, it fails with No module named 'torch'. But that is obviously wrong, since torch is installed. See screenshot.
Same issue for me.
Workaround: install the previous version pip install flash_attn==1.0.5
I am seeing the same problem on every flash_attn version. I am using CUDA 12.1 on the new G2 VM instance from GCP. https://cloud.google.com/compute/docs/accelerator-optimized-machines#g2-vms. The underlying GPU is the Nvidia L4, which uses the Ada architecture.
Workaround: install the previous version pip install flash_attn==1.0.5
This might work in some scenarios but not all.
Can you try python -m pip install flash-attn?
It's possible that pip and python -m pip refer to different environments.
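A quick sanity check, as a sketch: compare which interpreter each command is bound to, and whether torch is visible from it.

which python
which pip
python -m pip --version                              # shows which Python this pip belongs to
python -c "import torch; print(torch.__version__)"   # confirms torch is importable from that interpreter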
Getting the dependencies right for all setups is hard. We had torch as a dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. So for 1.0.6 we left torch out of the dependencies.
Getting the same issue. I also tried python -m pip install flash-attn as you suggested with the same failure.
same problem here.
I don't know a right solution that works for all setups, happy to hear suggestions.
We recommend the Pytorch container from Nvidia, which has all the required tools to install FlashAttention.
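For reference, a sketch of using one of those containers (the tag is only an example; pick a current one from the NGC catalog):

docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.05-py3
# inside the container, torch and the CUDA toolkit (nvcc) are already installed,
# so the flash-attn build can find them:
pip install flash-attn --no-build-isolation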
I believe this is an incompatibility issue with the CUDA 12.1 build of torch.
Using the following torch version solved my problem:
torch==2.0.0+cu117
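For reference, the cu117 builds come from the PyTorch wheel index; a sketch of installing that pinned version (adjust for your own setup):

pip install "torch==2.0.0+cu117" --index-url https://download.pytorch.org/whl/cu117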
@smeyerhot I used that exact version, but it doesn't work. See the screenshot.
@MilesQLi
I believe this is an incompatibility issue with the CUDA 12.1 build of torch.
Using the following torch version solved my problem:
torch==2.0.0+cu117
Sorry! This didn't fix things... apologies on the false hope.
@smeyerhot No problem. Thanks a lot anyway!
same problem
same problem
pip install flash-attn==1.0.5 might help. I am using torch 1.13 and cuda 12.0.
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I also had the same issue, but my system needs CUDA 12.1 (2x Nvidia L4), so using a cu117 build of torch is not an option.
this is also my workaround and it works like a charm.
my system uses Fedora Server
I compiled it myself using a docker container and I still get this when executing
RuntimeError: Expected q_dtype == torch::kFloat16 || ((is_sm8x || is_sm90) && q_dtype == torch::kBFloat16) to be true
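That RuntimeError is about input dtype rather than the build: the kernel only accepts fp16, or bf16 on sm8x/sm90 GPUs (A100, L4, H100). A minimal sketch of casting the inputs before the call; the import path and signature of flash_attn_func differ between flash-attn versions, so treat the call itself as an assumption and keep the dtype point:

import torch
from flash_attn import flash_attn_func  # 2.x-style import; 1.x exposes different entry points

# flash-attn kernels require half-precision inputs; fp32 tensors trigger the
# "Expected q_dtype == torch::kFloat16 || ..." error quoted above.
q = torch.randn(2, 1024, 16, 64, device="cuda")      # float32 by default
k = torch.randn(2, 1024, 16, 64, device="cuda")
v = torch.randn(2, 1024, 16, 64, device="cuda")
q, k, v = (t.to(torch.float16) for t in (q, k, v))   # or torch.bfloat16 on Ampere/Ada/Hopper
out = flash_attn_func(q, k, v, dropout_p=0.0, causal=False)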
Try pip install flash-attn --no-build-isolation; that fixed my problem.
pip docs
To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
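For illustration, a sketch of what that suggestion might look like in pyproject.toml (assuming a setuptools backend; note the maintainer's point below that declaring torch this way made pip pull a separate torch into the isolated build environment for some users):

[build-system]
# Packages pip installs into the isolated build environment before building.
# Declaring torch here lets setup.py import it, but pip may resolve a different
# torch build than the one already installed in the runtime environment.
requires = ["setuptools", "wheel", "packaging", "ninja", "torch"]
build-backend = "setuptools.build_meta"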
@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.
To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
We had torch as a dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. I'm not really an expert in Python packaging, so it's possible I'm doing something wrong.
@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.
Hi, actually I am using Linux, and it also worked well there. I assume you may be missing some other package needed to build this on your Linux system.
From the pip docs on --no-build-isolation: "Build dependencies specified by PEP 518 must be already installed if this option is used."
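In other words, with --no-build-isolation pip no longer creates a throwaway build environment, so the build dependencies must be installed up front. A sketch of the full sequence (the package list is an assumption based on this thread):

# install the build dependencies into the existing environment first
pip install packaging ninja wheel
# torch must already be present and match your CUDA version
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# then build flash-attn against that torch, with isolation disabled
pip install flash-attn --no-build-isolation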
Same problem for me; I solved it by checking that my device's CUDA version matches my torch CUDA version.
Try pip install flash-attn --no-build-isolation; that fixed my problem. pip docs. To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
This fixed the torch problem, but now I get another error. It might be related to something else, though.
435 | function(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
530 | operator=(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
error: command '/usr/bin/nvcc' failed with exit code 255
@xwyzsn ninja was removed, then torch was removed, then ninja was re-added. Next logical step is to re-add torch. right??? 😄
Same issue on Kubuntu 20 with torch 2.0.1, CUDA 11.8, Python 3.9 / 3.10, and flash-attn versions 0.2.8 / 1.0.4 / 1.0.5 / 1.0.6 / 1.0.7, with and without the --no-build-isolation flag.
Thanks to the previous answers, I was able to install it successfully. Here is my experience. Environment: torch v2.0.0 + CUDA 11.7 on Ubuntu.
- I hit the error ModuleNotFoundError: No module named 'torch', so I installed with pip install flash-attn --no-build-isolation.
- That raised another error, ModuleNotFoundError: No module named 'packaging', so I installed that package with pip install packaging.
- Re-running the installation, another error came up: RuntimeError: The current installed version of g++ (4.8.5) is less than the minimum required version by CUDA 11.7 (6.0.0). Please make sure to use an adequate version of g++ (>=6.0.0, <12.0).
- I switched to a higher version of g++ (9.0), and it finally works (see the compiler-upgrade sketch below).
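If you also hit the g++ version check, here is a sketch of putting a newer compiler in front of the build (package names depend on your distro, and whether the build picks up CC/CXX is an assumption about PyTorch's extension builder):

# Ubuntu/Debian: install a newer g++ next to the default one
sudo apt-get install -y gcc-9 g++-9
export CC=gcc-9 CXX=g++-9          # ask the extension build to use it
pip install flash-attn --no-build-isolation

# CentOS 7 (where g++ 4.8.5 is the system default): use a devtoolset instead
# sudo yum install -y centos-release-scl devtoolset-9
# scl enable devtoolset-9 bash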
Workaround: install the previous version pip install flash_attn==1.0.5
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I also had the same issue, but my system needs CUDA 12.1 (2x Nvidia L4), so using a cu117 build of torch is not an option.
this is also my workaround and it works like a charm.
my system uses Fedora Server
What was your solution for CUDA 12 and the L4 GPU?
I got the same issue here. I was only able to build from source (clone the repo, then run python setup.py install). pip install git+https://github.com/HazyResearch/flash-attention also gives me the same error. I'm using torch==1.12.1+cu113.
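For anyone else taking the build-from-source route, a sketch of the steps described above (it assumes torch, packaging, and ninja are already installed and a compatible g++ is available):

git clone https://github.com/HazyResearch/flash-attention
cd flash-attention
# build against the torch already present in this environment
python setup.py install
# or, equivalently: pip install . --no-build-isolation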