[BUG] Version 2.6 is bound to flash_attn by default with no way to disable it, and no matching flash_attn version or installation example is provided.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
- [X] 我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`
The 2.6 demo throws this error.
期望行为 | Expected Behavior
Please either make the flash_attn dependency optional, or provide an installation guide that specifies which version to use.
复现方法 | Steps To Reproduce
Run the 2.6 demo script on Linux.
运行环境 | Environment
Python: 3.10
Transformers: 4.40.0
PyTorch: 2.4.0+cu121
CUDA: 12.2
备注 | Anything else?
No response
I don't understand Chinese, but I think we have a similar problem with flash_attn.
I use Python 3.11
and install the libraries in this order:
numpy==1.24.3 Pillow==10.1.0 torch==2.1.2 torchvision==0.16.2 transformers==4.40.0 sentencepiece==0.1.99 accelerate==0.30.1 bitsandbytes==0.43.1 flash_attn
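For anyone hitting build failures on that last flash_attn install: the prebuilt wheels are tied to specific torch, CUDA, Python, and C++ ABI combinations, so it helps to print those before picking one. A minimal sketch (matching the printed values against the wheel names on the flash-attention release page is left to you):

import platform
import sys
import torch

# These values determine which prebuilt flash-attn wheel, if any, fits this machine.
print("python:", ".".join(map(str, sys.version_info[:3])))
print("platform:", platform.system(), platform.machine())
print("torch:", torch.__version__)
print("torch built for CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("C++11 ABI:", torch.compiled_with_cxx11_abi())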
Has anyone found a solution? I'm stuck here too. I'm on Windows with cu117 and torch 2.1.0 and couldn't find a matching flash_attn package.
Haha, I gave up on Windows and switched to a desktop machine, where it worked on the first try:
RTX 3090 GPU
Ubuntu20.04
Driver Version: 545.23.08
CUDA Version: 12.3
transformers 4.40.0
torch 2.4.0+cu124
torchaudio 2.4.0+cu124
flash-attn 2.6.3
flash-attn installed in one go~~
Ugh, this is so hard. I don't seem to have a Linux machine.
On Windows, just use the prebuilt .whl file that matches your CUDA, cuDNN, and torch versions.
Try installing flash-attn==1.0.4; that works on my side.
Solved it on macOS; wrote it up in a blog post: https://bothsavage.github.io/article/240810-minicpm2.6
Submitted a PR: https://github.com/OpenBMB/MiniCPM-V/pull/461
Modify the web_demo_2.6.py file:
import os
import torch
from typing import Union
from unittest.mock import patch
from transformers import AutoModel
from transformers.dynamic_module_utils import get_imports

# fix the imports: drop the flash_attn requirement from the remote-code import check
def fixed_get_imports(filename: Union[str, os.PathLike]) -> list[str]:
    imports = get_imports(filename)
    if not torch.cuda.is_available() and "flash_attn" in imports:
        imports.remove("flash_attn")
    return imports

# ...
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True,
                                      torch_dtype=torch.bfloat16)  # or torch.float16 on devices without bf16 support
    model = model.to(device=device)
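Note that this patch only skips the flash_attn requirement when torch.cuda.is_available() is False (e.g. CPU or Apple MPS); on a CUDA machine the remote code will still ask for flash_attn unless you install it or switch to a different attention implementation.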
Hi, solution for me:
- Removed the pinned versions of some packages in requirements.txt, and don't forget to uncomment flash_attn and add bitsandbytes, so it lists:
  spacy gradio torch torchvision bitsandbytes flash_attn
- pip install --upgrade pip setuptools wheel  # this solves the torch <-> flash_attn wheel issue
  pip install -r requirements.txt
  Install went fine.
- pip install torch torchvision --upgrade  # needed for me.
  This shows: Successfully installed nvidia-cudnn-cu12-9.1.0.70 nvidia-nccl-cu12-2.20.5 torch-2.4.0 torchvision-0.19.0 triton-3.0.0
- python web_demo_2.6.py --device cuda
Now the web demo starts to run, but I got CUDA out of memory. :) (I've got 12 GB VRAM on an RTX 3060) but that is another problem.
You can also just remove the flash_attn package entirely and it still runs; I saw that other fallback ways to run it are documented.
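For reference, a minimal sketch of what that fallback can look like using plain transformers arguments; whether MiniCPM-V's remote code honors attn_implementation this way, and the exact model id, are assumptions to check against the model card:

import torch
from transformers import AutoModel

model_path = "openbmb/MiniCPM-V-2_6"  # assumed model id; use the path from your demo script

# "sdpa" falls back to PyTorch's built-in scaled_dot_product_attention instead of flash_attn
model = AutoModel.from_pretrained(
    model_path,
    trust_remote_code=True,
    attn_implementation="sdpa",
    torch_dtype=torch.bfloat16,
)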
I caught the import error; it may look like /home/xxx/miniconda3/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE, see https://github.com/Dao-AILab/flash-attention/issues/919
So I use flash-attn==2.5.8 since I have torch==2.3.0.
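A quick way to check for that torch/flash-attn ABI mismatch before launching the demo is to import the compiled extension directly; a minimal sketch:

import torch

print("torch:", torch.__version__, "CUDA:", torch.version.cuda)

try:
    # importing flash_attn_func loads flash_attn_2_cuda; an ABI mismatch
    # surfaces here as the same "undefined symbol" ImportError
    import flash_attn
    from flash_attn import flash_attn_func
    print("flash-attn:", flash_attn.__version__, "loads fine")
except ImportError as e:
    print("flash-attn import failed:", e)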
The web_demo_2.6.py fix above is very useful, thx bro.