
undefined symbol: _ZN3c104cuda9SetDeviceEi

Open JoseGuilherme1904 opened this issue 1 year ago • 12 comments

Sorry, please help; I get this error: axolotl$ accelerate launch -m axolotl.cli.train examples/openllama-3b/lora.yml

... ... /.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi

JoseGuilherme1904 avatar Oct 20 '23 19:10 JoseGuilherme1904

I have the same error using auto-gptq (built from source)

CUDA 12.2 + Python 3.9

Vector-Gaming avatar Oct 26 '23 05:10 Vector-Gaming

Encountered the same error with:

python 3.8 
transformers                 4.34.0
flash-attn                   2.3.2
torch                        2.1.0a0+fe05266
accelerate                   0.23.0

zheyuye avatar Nov 06 '23 07:11 zheyuye

Probably an issue with the PyTorch version. Can you try PyTorch 2.0.0 or 2.1.0?
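
For example, something along these lines pins one of those versions and rebuilds flash-attn against it (a rough sketch; the exact torch build to pick depends on your CUDA toolkit):

python -c "import torch; print(torch.__version__, torch.version.cuda)"  # check what is currently installed
pip install torch==2.1.0
pip install flash-attn --no-build-isolation --force-reinstall  # recompile against the pinned torch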

tridao avatar Nov 06 '23 07:11 tridao

I have the same error. How do I solve it?

Kk1984up avatar Nov 08 '23 06:11 Kk1984up

I ran into this issue and solved it by doing this:

pip install flash_attn -U --force-reinstall

For me this re-installed all the libraries with the versions that flash_attn wants...
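
A quick import check afterwards confirms the compiled extension loads cleanly (assuming the installed flash_attn release exposes __version__, which recent ones do):

python -c "import flash_attn; print(flash_attn.__version__)"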

lhl avatar Nov 10 '23 12:11 lhl

Reached here with the exact same error trying to use axolotl. I did a fresh virtual-env install of axolotl and still ended up with this error. Using the note from @lhl, the error went away, but I did end up getting a few installation errors:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 2.15.0 requires fsspec[http]<=2023.10.0,>=2023.1.0, but you have fsspec 2023.12.2 which is incompatible.
gcsfs 2023.10.0 requires fsspec==2023.10.0, but you have fsspec 2023.12.2 which is incompatible.
s3fs 2023.10.0 requires fsspec==2023.10.0, but you have fsspec 2023.12.2 which is incompatible.
xformers 0.0.22 requires torch==2.0.1, but you have torch 2.1.1 which is incompatible.

But, FWIW, this particular error went away.
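
Those conflicts come from pip pulling fsspec and torch past what datasets/gcsfs/s3fs/xformers are pinned to. If they cause trouble, re-pinning along these lines should clear them (versions taken from the messages above):

pip install "fsspec==2023.10.0"  # the version datasets, gcsfs and s3fs ask for
pip install -U xformers          # newer xformers builds target newer torch releases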

samikrc avatar Dec 12 '23 09:12 samikrc

I installed the latest version with "pip install flash-attn --no-build-isolation" and the problem was solved.
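
The --no-build-isolation flag makes pip compile the extension against the PyTorch already installed in the environment rather than an isolated build-time copy, so the resulting binary matches your torch. A typical sequence (assuming a previously broken install) looks roughly like:

pip uninstall -y flash-attn
pip install flash-attn --no-build-isolation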

linhx25 avatar Mar 12 '24 08:03 linhx25

I have solved this problem thoroughly and I know why it happens. The "undefined symbol" message is a dynamic-linker error: the compiled extension cannot find the symbol it expects in the shared libraries (.so files) it is loaded against. That happens when the prebuilt binary was compiled against a different CUDA/PyTorch stack than the one installed (for example, a CUDA 12+ environment while the wheel was built for an older toolchain). To fix it, find a Python wheel (*.whl) built for your CUDA version, e.g. CUDA 11.8. For instance, if exllamav2_ext.cpython is the module reporting mismatched symbols, installing the newest wheel built for your setup is normally enough. Concretely, change the corresponding entry in requirements.txt: browse https://github.com/turboderp/exllamav2/releases/download/ and look for the cu118 builds; assuming python==3.11, you can use this one to replace the old entry in requirements.txt:

https://github.com/turboderp/exllamav2/releases/download/v0.0.20/exllamav2-0.0.20+cu118-cp311-cp311-linux_x86_64.whl

Then run pip install -r requirements.txt.

Similarly, for flash_attn you can use this wheel as a replacement, for example: https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
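
Assuming CUDA 11.8, PyTorch 2.3, CPython 3.11 and the cxx11abi=FALSE build, that wheel can also be installed directly:

pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl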

he-mark-qinglong avatar May 08 '24 05:05 he-mark-qinglong

I built flash-attention from source and the problem was solved.

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=4 python setup.py install
pip install -e .

TJKlein avatar Jun 06 '24 22:06 TJKlein

Do not use version 2.5.9.post1; use version 2.5.8 and everything is fine.
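
That is, pin the version explicitly:

pip install flash-attn==2.5.8 --no-build-isolation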

songhat avatar Jun 18 '24 02:06 songhat

I also installed the latest version with "pip install flash-attn --no-build-isolation" and the problem was solved.

1276440215 avatar Jul 01 '24 07:07 1276440215

I ran into this issue and solved it by doing this:

pip install flash_attn -U --force-reinstall

For me this re-installed all the libraries with the versions that flash_attn wants...

I also used this to solve my problem.

rongkunxue avatar Jul 24 '24 03:07 rongkunxue