flash-attention
No Module Named 'torch'
When I run pip install flash-attn, it fails with No module named 'torch'. But that is obviously wrong, since torch is installed. See screenshot.
Same issue for me.
Workaround: install the previous version pip install flash_attn==1.0.5
I am seeing the same problem on every flash_attn version. I am using CUDA 12.1 on the new G2 VM instance from GCP. https://cloud.google.com/compute/docs/accelerator-optimized-machines#g2-vms. The underlying GPU is the Nvidia L4, which uses the Ada architecture.
Workaround: install the previous version pip install flash_attn==1.0.5
This might work in some scenarios but not all.
Can you try python -m pip install flash-attn?
It's possible that pip and python -m pip refer to different environments.
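A quick sanity check, as a sketch: compare which interpreter each command is bound to, and whether torch is visible from it.

which python
which pip
python -m pip --version                              # shows which Python this pip belongs to
python -c "import torch; print(torch.__version__)"   # confirms torch is importable from that interpreter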
Getting the dependencies right for all setups is hard. We had torch as a dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. So for 1.0.6 we left torch out of the dependencies.
Getting the same issue. I also tried python -m pip install flash-attn as you suggested with the same failure.
same problem here.
I don't know a right solution that works for all setups, happy to hear suggestions.
We recommend the Pytorch container from Nvidia, which has all the required tools to install FlashAttention.
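For reference, a sketch of using one of those containers (the tag is only an example; pick a current one from the NGC catalog):

docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.05-py3
# inside the container, torch and the CUDA toolkit (nvcc) are already installed,
# so the flash-attn build can find them:
pip install flash-attn --no-build-isolation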
I believe this is an incompatibility issue with the CUDA 12.1 build of torch.
Using the following torch version solved my problem:
torch==2.0.0+cu117
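For reference, the cu117 builds come from the PyTorch wheel index; a sketch of installing that pinned version (adjust for your own setup):

pip install "torch==2.0.0+cu117" --index-url https://download.pytorch.org/whl/cu117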
@smeyerhot I used that exact version, but it doesn't work. See the screenshot.
@MilesQLi
I believe this is an incompatibility issue with the CUDA 12.1 build of torch.
Using the following torch version solved my problem:
torch==2.0.0+cu117
Sorry! This didn't fix things... apologies on the false hope.
@smeyerhot No problem. Thanks a lot anyway!
same problem
same problem
pip install flash-attn==1.0.5 might help. I am using torch 1.13 and cuda 12.0.
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I also had the same issue, but my system needs CUDA 12.1 (2x Nvidia L4), so using a cu117 build of torch is not an option.
this is also my workaround and it works like a charm.
my system uses Fedora Server
I compiled it myself using a docker container and I still get this when executing
RuntimeError: Expected q_dtype == torch::kFloat16 || ((is_sm8x || is_sm90) && q_dtype == torch::kBFloat16) to be true
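That RuntimeError is about input dtype rather than the build: the kernel only accepts fp16, or bf16 on sm8x/sm90 GPUs (A100, L4, H100). A minimal sketch of casting the inputs before the call; the import path and signature of flash_attn_func differ between flash-attn versions, so treat the call itself as an assumption and keep the dtype point:

import torch
from flash_attn import flash_attn_func  # 2.x-style import; 1.x exposes different entry points

# flash-attn kernels require half-precision inputs; fp32 tensors trigger the
# "Expected q_dtype == torch::kFloat16 || ..." error quoted above.
q = torch.randn(2, 1024, 16, 64, device="cuda")      # float32 by default
k = torch.randn(2, 1024, 16, 64, device="cuda")
v = torch.randn(2, 1024, 16, 64, device="cuda")
q, k, v = (t.to(torch.float16) for t in (q, k, v))   # or torch.bfloat16 on Ampere/Ada/Hopper
out = flash_attn_func(q, k, v, dropout_p=0.0, causal=False)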
Try pip install flash-attn --no-build-isolation; that fixed my problem.
pip docs
To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
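For illustration, a sketch of what that suggestion might look like in pyproject.toml (assuming a setuptools backend; note the maintainer's point below that declaring torch this way made pip pull a separate torch into the isolated build environment for some users):

[build-system]
# Packages pip installs into the isolated build environment before building.
# Declaring torch here lets setup.py import it, but pip may resolve a different
# torch build than the one already installed in the runtime environment.
requires = ["setuptools", "wheel", "packaging", "ninja", "torch"]
build-backend = "setuptools.build_meta"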
@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.
To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
We had torch as a dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. I'm not really an expert in Python packaging, so it's possible I'm doing something wrong.
@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.
Hi, actually I am using Linux, and it also worked well there. I assume you may be missing some other package needed to build this on your Linux system.
From the pip docs on --no-build-isolation: "Build dependencies specified by PEP 518 must be already installed if this option is used."
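In other words, with --no-build-isolation pip no longer creates a throwaway build environment, so the build dependencies must be installed up front. A sketch of the full sequence (the package list is an assumption based on this thread):

# install the build dependencies into the existing environment first
pip install packaging ninja wheel
# torch must already be present and match your CUDA version
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# then build flash-attn against that torch, with isolation disabled
pip install flash-attn --no-build-isolation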
Same problem for me; I solved it by checking that my device's CUDA version matches my torch CUDA version.
Try pip install flash-attn --no-build-isolation; that fixed my problem. pip docs. To fix this problem, maybe adding torch as a dependency in pyproject.toml can help.
This fixed the torch problem, but now I get another error. It might be related to something else, though.
435 | function(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
530 | operator=(_Functor&& __f)
| ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
error: command '/usr/bin/nvcc' failed with exit code 255
@xwyzsn ninja was removed, then torch was removed, then ninja was re-added. Next logical step is to re-add torch. right??? 😄
Same issue on Kubuntu 20 with torch 2.0.1, CUDA 11.8, Python 3.9 / 3.10, and flash-attn versions 0.2.8 / 1.0.4 / 1.0.5 / 1.0.6 / 1.0.7, with and without the --no-build-isolation flag.
Thanks to the previous answers, I was able to install it successfully. Here is my experience. Environment: torch v2.0.0 + CUDA 11.7 on Ubuntu.
- I hit the error ModuleNotFoundError: No module named 'torch', so I installed with pip install flash-attn --no-build-isolation.
- That raised another error, ModuleNotFoundError: No module named 'packaging', so I installed that package with pip install packaging.
- Re-running the installation, another error came up: RuntimeError: The current installed version of g++ (4.8.5) is less than the minimum required version by CUDA 11.7 (6.0.0). Please make sure to use an adequate version of g++ (>=6.0.0, <12.0).
- I switched to a higher version of g++ (9.0), and it finally works (see the compiler-upgrade sketch below).
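If you also hit the g++ version check, here is a sketch of putting a newer compiler in front of the build (package names depend on your distro, and whether the build picks up CC/CXX is an assumption about PyTorch's extension builder):

# Ubuntu/Debian: install a newer g++ next to the default one
sudo apt-get install -y gcc-9 g++-9
export CC=gcc-9 CXX=g++-9          # ask the extension build to use it
pip install flash-attn --no-build-isolation

# CentOS 7 (where g++ 4.8.5 is the system default): use a devtoolset instead
# sudo yum install -y centos-release-scl devtoolset-9
# scl enable devtoolset-9 bash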
Workaround: install the previous version pip install flash_attn==1.0.5
I had the same issue with pip. Workaround was to compile from source, worked as a charm
In [1]: import flash_attn
In [2]: import torch
In [3]: torch.__version__
Out[3]: '2.0.1+cu117'
In [4]: flash_attn.__version__
Out[4]: '1.0.6'
I also had the same issue, but my system needs CUDA 12.1 (2x Nvidia L4), so using a cu117 build of torch is not an option.
this is also my workaround and it works like a charm.
my system uses Fedora Server
What was your solution for CUDA 12 and the L4 GPU?
I got the same issue here. I was only able to build from source (clone the repo, then run python setup.py install). pip install git+https://github.com/HazyResearch/flash-attention also gives me the same error. I'm using torch==1.12.1+cu113.
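For anyone else taking the build-from-source route, a sketch of the steps described above (it assumes torch, packaging, and ninja are already installed and a compatible g++ is available):

git clone https://github.com/HazyResearch/flash-attention
cd flash-attention
# build against the torch already present in this environment
python setup.py install
# or, equivalently: pip install . --no-build-isolation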